Eikonal Blog

2011.08.17

eBooks and eBook Format Transformers

Sites


Articles


Devices and other readers

  • Amazon’s Kindle
  • Barnes & Noble’s Nook
  • FBReader — e-book reader for Unix/Windows computers – http://www.fbreader.org/

eBook format transformers

Kindle blogs

PDF

2010.10.05

sed tricks

Filed under: scripting, transformers | sandokan65 @ 15:58

These one-liners are collected from various sites and articles on the web; see the list of Sources at the bottom of this post.

  • Deleting all empty lines from the input file:
    sed '/^$/d' INPUTFILE
  • In-place replacement:
    sed -i '/^$/d' INPUTFILE
  • In-place replacement with backup of original file (saved as INPUTFILE.bak):
    sed -i.bak '/^$/d' INPUTFILE
  • In-place deletion of all lines that contain a string:
    sed -i '/WORDTOBEDELETED/d' INPUTFILE
  • Replacing only the first occurrence of a string match in a file (GNU sed):
    sed '0,/THISSTRING/s//TOTHATSTRING/' INPUTFILE
  • Appending a directory to the PATH variable in /etc/environment:
    sed -e '/^PATH/s/"$/:\/usr\/lib\/myprog\/bin"/g' -i /etc/environment
  • Removing all whitespace from the beginning of lines:
    sed 's/^[ \t]*//' foo
  • Deleting the leading / from src attributes in all HTML files in the current folder:
    sed -i 's/src="\//src="/g' *.html
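  • Greedy matching (.* runs from the first < to the last >):
    % echo "<b>foo</b>bar" | sed 's/<.*>//g'
    bar
    
  • Non-greedy matching (sed has no lazy quantifier, so a negated character class stops at the first >):
    % echo "<b>foo</b>bar" | sed 's/<[^>]*>//g'
    foobar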
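
The one-liners above can be wrapped in small shell functions and tried on throwaway input before they are pointed at real files. This is only a sketch: the function names are made up for this illustration, and replace_first assumes GNU sed, because the 0,/regex/ address form is a GNU extension.

```shell
#!/bin/sh
# Hypothetical helper names; each function filters stdin to stdout.

# Delete empty lines.
delete_empty_lines() {
    sed '/^$/d'
}

# Replace only the first match of regex $1 in the whole input with $2.
# Assumes GNU sed (0,/re/ address). A '/' inside $1 or $2 must be escaped.
replace_first() {
    sed "0,/$1/s//$2/"
}

# Remove HTML-like tags; [^>]* stops at the first '>' instead of the last.
strip_tags() {
    sed 's/<[^>]*>//g'
}
```

For example, `printf 'alpha\nbeta alpha\n' | replace_first alpha ALPHA` changes only the first line, and `echo '<b>foo</b>bar' | strip_tags` leaves the text between the tags in place.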
    

Sources:

References


Related here: Command line based text replace – https://eikonal.wordpress.com/2010/07/13/command-line-based-text-replace/.

Related here: Scripting languages – https://eikonal.wordpress.com/2010/06/15/awk-sed/ | Unix tricks – https://eikonal.wordpress.com/2011/02/15/unix-tricks/ | SED tricks – https://eikonal.wordpress.com/2010/10/05/sed-tricks/ | Memory of things disappearing > nmap stuff > getports.awk – https://eikonal.wordpress.com/2010/06/23/memory-of-things-disappearing-nmap-stuff-getports-awk/ | AWK – https://eikonal.wordpress.com/2011/09/30/awk/

2010.07.13

Command line based text replace

sed

  • sed 's/Mark Monre/Marc Monroe/' 1.txt > 2.txt
  • find . -type f -exec sed -i 's/OLD-STRING/NEW-STRING/g' {} \;

The “replace” command

  • Syntax:
    replace OLD-STRING NEW-STRING < INPUT-FILE > OUTPUT-FILE
  • Example:
    $ replace UNIX Linux < oldfile > newfile
  • Example:
    $ cat /etc/passwd | replace : '|'
  • Partial support for regular expressions: \^ matches start of line, and $ matches end of line.
  • Example: replace the IP address 192.168.1.2 at the start of a line:
    $ replace \^192.168.1.2 192.168.5.10 < oldfile > newfile
  • A bash script, ‘fixer.sh’:
    #!/bin/bash
    # Replace CHANGEFROM with CHANGETO in the file named by $1.
    replace CHANGEFROM CHANGETO < "$1" > "$1.tmp"
    rm "$1"
    mv "$1.tmp" "$1"
    

    Now run this command line:

    $ grep -l CHANGEFROM * | xargs -n 1 ./fixer.sh

    As a result, every file in the directory (or whatever you grep for) that contains CHANGEFROM is changed automatically.
    Just make sure the grep doesn’t include the fixer script itself, or it will die half-way through, because the rewritten script loses its execute permission.
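
The permission caveat above can be sidestepped by copying the edited text back over the original file instead of moving a new file into its place. The sketch below swaps in sed for the replace utility (which may not be installed everywhere); the function names are invented for this illustration, and the first argument is treated as a sed regular expression rather than a literal string.

```shell
#!/bin/sh
# replace_in_file FROM TO FILE  (hypothetical helper)
# FROM is a sed basic regular expression; escape . * [ ] \ and / if needed.
replace_in_file() {
    from=$1 to=$2 file=$3
    tmp=$(mktemp) || return 1
    # Edit into a temp file, then copy the bytes back over the original,
    # so the original keeps its permissions (e.g. the execute bit).
    if sed "s/$from/$to/g" "$file" > "$tmp"; then
        cat "$tmp" > "$file"
    fi
    rm -f "$tmp"
}

# replace_in_dir FROM TO  (hypothetical helper)
# Applies replace_in_file to every file in the current directory that
# contains FROM. File names containing newlines are not handled.
replace_in_dir() {
    grep -l -- "$1" ./* 2>/dev/null | while IFS= read -r f; do
        replace_in_file "$1" "$2" "$f"
    done
}
```

Running `replace_in_dir CHANGEFROM CHANGETO` inside a directory then plays the role of the grep/xargs pipeline, and an executable script in that directory stays executable after it has been rewritten.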


Perl


Sources:


Related: Regular expressions – https://eikonal.wordpress.com/2010/04/02/regular-expressions/ | Perl online – https://eikonal.wordpress.com/2010/02/15/perl-online/

2010.04.02

Regular expressions

Sites

Tools

Standalone tools:

Online testers:

Books

Tidbits

Sources: The above links.

  • [abc] – A single character: a, b or c
  • [^abc] – Any single character but a, b, or c
  • [a-z] – Any single character in the range a-z
  • [a-zA-Z] – Any single character in the range a-z or A-Z
  • ^ – Start of line
  • $ – End of line
  • \A – Start of string
  • \z – End of string
  • . – Any single character
  • \s – Any whitespace character
  • \S – Any non-whitespace character
  • \d – Any digit
  • \D – Any non-digit
  • \w – Any word character (letter, number, underscore)
  • \W – Any non-word character
  • \b – A word boundary (a zero-width position, not a character)
  • (…) – Capture everything enclosed
  • (a|b) – a or b
  • a? – Zero or one of a
  • a* – Zero or more of a
  • a+ – One or more of a
  • a{3} – Exactly 3 of a
  • a{3,} – 3 or more of a
  • a{3,6} – Between 3 and 6 of a
  • ^[ \t]*$ – Match a blank line (empty, or only spaces and tabs)
  • \d{2}-\d{5} – Validate an ID number consisting of 2 digits, a hyphen, and another 5 digits
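
The shorthand classes in this list (\d, \s, \w) are Perl-style; \d in particular is not part of POSIX extended regular expressions, so with grep -E the last two entries have to be spelled with bracket expressions. A minimal sketch, with made-up function names:

```shell
#!/bin/sh
# is_id STRING: succeed if STRING is exactly 2 digits, a hyphen, 5 digits.
# [0-9] stands in for \d; -x makes the pattern match the whole line,
# which is what turns "find" into "validate".
is_id() {
    printf '%s\n' "$1" | grep -Eqx '[0-9]{2}-[0-9]{5}'
}

# is_blank STRING: succeed if STRING is empty or only spaces and tabs.
is_blank() {
    printf '%s\n' "$1" | grep -Eq '^[[:blank:]]*$'
}
```

Without -x (or explicit ^ and $ anchors), the ID pattern would also accept longer strings such as 123-456789 that merely contain a valid ID.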

Special common strings:

  • Personal Name: ^[\w\.\']{2,}([\s][\w\.\']{2,})+$
  • Username: ^[\w\d\_\.]{4,}$
  • Password at least 6 symbols: ^.{6,}$
  • Password or empty input: ^.{6,}$|^$
  • email: ^[\_]*([a-z0-9]+(\.|\_*)?)+@([a-z][a-z0-9\-]+(\.|\-*\.))+[a-z]{2,6}$
  • Email address: [A-Za-z0-9_.%+-]+@[A-Za-z0-9_.%+-]+\.[A-Za-z]{2,4}
  • US phone: \W?\d{3}\W?\d{3}\W?\d{4}
  • US Phone number: ^\+?[\d\s]{3,}$
  • US Phone with code: ^\+?[\d\s]+\(?[\d\s]{10,}$
  • URL: \b\w+://(\w|-|\.|/)+(/|\b)
  • US Social Security Number (SSN): \d{3}-\d{2}-\d{4}
  • US ZIP: \d{5}(-\d{4})?
  • IP (v4) address: \b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b
  • IP (v4) address: \b(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b
  • IP (v4) address: ^(\d|[1-9]\d|1\d\d|2[0-4]\d|25[0-5])(\.(\d|[1-9]\d|1\d\d|2[0-4]\d|25[0-5])){3}$
  • IP (v4) address: \b(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b
  • IP (v6) address:
  • MAC address: ^([0-9a-fA-F][0-9a-fA-F]:){5}([0-9a-fA-F][0-9a-fA-F])$
  • Positive Integers: ^\d+$
  • Negative Integers: ^-\d+$
  • Integer: ^-{0,1}\d+$
  • Positive Number: ^\d*\.{0,1}\d+$
  • Negative Number: ^-\d*\.{0,1}\d+$
  • Positive Number or Negative Number: ^-{0,1}\d*\.{0,1}\d+$
  • Floating point number: [-+]?([0-9]*\.[0-9]+|[0-9]+)
  • Floating point number: [-+]?(?:\b[0-9]+(?:\.[0-9]*)?|\.[0-9]+\b)(?:[eE][-+]?[0-9]+\b)?
  • Roman number: ^(?i:(?=[MDCLXVI])((M{0,3})((C[DM])|(D?C{0,3}))?((X[LC])|(L?XX{0,2})|L)?((I[VX])|(V?(II{0,2}))|V)?))$
  • Domain Name: ^([a-zA-Z0-9]([a-zA-Z0-9\-]{0,61}[a-zA-Z0-9])?\.)+[a-zA-Z]{2,6}$
  • Domain Name: ^([a-z][a-z0-9\-]+(\.|\-*\.))+[a-z]{2,6}$
  • Windows File Name: (?i)^(?!^(PRN|AUX|CLOCK\$|NUL|CON|COM\d|LPT\d|\..*)(\..+)?$)[^\\\./:\*\?\"\|][^\\/:\*\?\"\|]{0,254}$
  • Date in format yyyy-MM-dd: (19|20)\d\d([- /.])(0[1-9]|1[012])\2(0[1-9]|[12][0-9]|3[01])
  • Date (dd mm yyyy, d/m/yyyy, etc.): ^([1-9]|0[1-9]|[12][0-9]|3[01])\D([1-9]|0[1-9]|1[012])\D(19[0-9][0-9]|20[0-9][0-9])$
  • Year 1900-2099: ^(19|20)[\d]{2,2}$

Related (here at this blog):
Command line based text replace – https://eikonal.wordpress.com/2010/07/13/command-line-based-text-replace/ |
Perl online – https://eikonal.wordpress.com/2010/02/15/perl-online/

2010.03.03

Document sharing sites

2010.02.15

Perl online

Hashes

Files

Chomp()

Control structures

Tidbits

Rename files

Alex Batko says (at http://www.cs.mcgill.ca/~abatko/computers/programming/perl/):

Here is a brilliant program for renaming one or more files according to a specified Perl expression. I found it on page 706 of Programming Perl (3rd edition).

#!/usr/bin/perl
$op = shift;
for( @ARGV ) {
    $was = $_;
    eval $op;
    die if $@;
    rename( $was, $_ ) unless $was eq $_;
}

In the code above, the second last line calls the built-in function “rename”, not the program itself (which is named “rename.pl”). Below are a few examples of use.

% rename.pl 's/\.htm/\.html/' *.htm         # append an 'l'
% rename.pl '$_ .= ".old"' *.html           # append '.old'
% rename.pl 'tr/A-Z/a-z/' *.HTML            # lowercase
% rename.pl 'y/A-Z/a-z/ unless /^Make/' *   # lowercase

Printing hashes

Starting with an input file with data in two columns separated by a comma (,):

#!/usr/bin/perl
use strict;
use warnings;

my %TempHash = ();
my $InputFile = shift;
print "Input file = ", $InputFile, "\n";

my ($line, $column1, $column2);

# Read the input file to populate the hash (column 1 => column 2)
open (INPUTSTREAM, '<', $InputFile) || die ("Could not open $InputFile: $!");
while ( $line = <INPUTSTREAM> ) {
    chomp $line;    # strip the trailing newline from $line, not from $_
    ($column1, $column2) = split ',', $line;
    $TempHash{$column1} = $column2;
    #print $column1," ==> ",$TempHash{$column1},"\n";
}
close (INPUTSTREAM);

## printing hash - way #1 (values only)
print "The following are in the DB: ", join(', ', values %TempHash), "\n";

## printing hash - way #2 (key/value pairs, in no particular order)
while (my ($key, $value) = each %TempHash)
{
    print "$key ==> $value\n";
}

## printing hash - way #3 (key/value pairs, sorted by key)
foreach my $key (sort keys %TempHash) {
    print "$key ==> $TempHash{$key}\n";
}

Removing white spaces

Sources:

# Declare the subroutines
sub trim($);
sub ltrim($);
sub rtrim($);

# Perl trim function to remove whitespace from the start and end of the string
sub trim($)
{
	my $string = shift;
	$string =~ s/^\s+//;
	$string =~ s/\s+$//;
	return $string;
}
# Left trim function to remove leading whitespace
sub ltrim($)
{
	my $string = shift;
	$string =~ s/^\s+//;
	return $string;
}
# Right trim function to remove trailing whitespace
sub rtrim($)
{
	my $string = shift;
	$string =~ s/\s+$//;
	return $string;
}

# Here is how to output the trimmed text "Hello world!"
my $string = "  Hello world!  ";
print trim($string)."\n";
print ltrim($string)."\n";
print rtrim($string)."\n";


Related: Regular Expressions – https://eikonal.wordpress.com/2010/04/02/regular-expressions/ | Command line based text replace – https://eikonal.wordpress.com/2010/07/13/command-line-based-text-replace/
