Archive for the ‘sed’ Category

Delete specific line numbers in a file – awk, sed

March 14, 2014

Problem:
You want to delete every third line in a file: lines 3, 6, 9, 12, and so on.

$ less file
first
second
third
fourth
fifth
sixth
seventh
eigth
nineth
tenth

Solution:
1. Using awk.

$ awk 'NR%3' file
first
second
fourth
fifth
seventh
eigth
tenth

My understanding of how it works.

$ man awk

NR The total number of input records seen so far.

$ awk 'NR' file 

This prints every row: NR is never 0, so the pattern is always true, and a true pattern with no action prints the record.
If you want to print only one particular row, compare NR with its number:

$ awk 'NR == 2' file
second
$ awk 'NR == 5' file
fifth

The example below prints nothing. The pattern 0 is always false, so the default print never happens.

$ awk '0' file

So with NR%3, every line whose remainder is not 0 gets printed.
1%3 is 1, so line 1 gets printed.
2%3 is 2, so line 2 gets printed.
3%3 is 0, so line 3 does not get printed.
4%3 is 1, so line 4 gets printed.
And so on.
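
The same idea works for any step, not just 3. A minimal sketch, assuming a helper name (delete_every_nth) that is made up here for illustration and is not from the original thread; it passes the step to awk with -v:

```shell
# delete_every_nth N [FILE]: hypothetical helper, drops lines N, 2N, 3N, ...
# NR % n is 0 (false) on every nth record, so only the other lines print.
delete_every_nth() {
    n=$1
    shift
    awk -v n="$n" 'NR % n' "$@"
}

# Example: drop every 3rd line of a 7-line input read from stdin
printf '%s\n' 1 2 3 4 5 6 7 | delete_every_nth 3
```

With no file argument awk reads standard input, which is why the example can be fed from a pipe.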

2. Using sed.

$ sed 'n;n;d' file
first
second
fourth
fifth
seventh
eigth
tenth

How I think it works.

$ sed 'n;n;d' file

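n – print the current line and move to the next line
n – print the current line and move to the next line
d – delete that line (the third one) without printing it.

As I understand it, d starts the next cycle immediately, so while there are still lines in the file the script starts over on the following line, and the same steps repeat until EOF.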

$ info sed

`n'
If auto-print is not disabled, print the pattern space, then,
regardless, replace the pattern space with the next line of input.
If there is no more input then `sed’ exits without processing any
more commands.

`d'
Delete the pattern space; immediately start next cycle.

NB: Blank lines count as lines too, so both commands number and delete them like any other line.
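
GNU sed also has a first~step address form that selects every step-th line directly. This is a GNU extension and was not part of the answer above, so treat the following as a sketch that assumes GNU sed; the function name is made up for illustration:

```shell
# drop_every_third [FILE]: hypothetical wrapper around GNU sed's first~step address.
# 0~3 addresses lines 3, 6, 9, ... and d deletes them.
drop_every_third() {
    sed '0~3d' "$@"
}

printf '%s\n' first second third fourth fifth sixth seventh | drop_every_third
```

Unlike n;n;d, this form states the step as a number, so changing it to every fourth line only means editing 0~3 to 0~4.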

Source:
http://www.unix.com/shell-programming-scripting/245088-how-delete-line-number-3-6-9-12-15-so.html

Categories: awk, Interesting, sed

sed – Convert to Title Case

December 2, 2013

Problem:
I needed to ensure that text read from a file was in Title Case. The solution had to use sed 🙂

Solution:
1. This is what I came up with before I googled and found much simpler solutions that work. Note that it only fixes the first word.

echo "noRTh weST, 0" |sed 's/^\([A-Za-z]\)\([A-Za-z]\+\)/\U\1\L\2/'
North weST, 0

2. A simpler solution

echo "noRTh weST, 0" | sed -e 's/.*/\L&/' -e 's/[a-z]*/\u&/g'
North West, 0

sed -e 's/.*/\L&/' - Convert all the characters to lowercase.
sed -e 's/[a-z]*/\u&/g' - Convert only the first character of each word to uppercase.

\L – Turn the replacement to lowercase until a \U or \E is found
& – References the whole matched portion of the pattern space
\u – Turn the next character to uppercase

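3. The above solution fails when a word contains an apostrophe, because [a-z]* stops at the apostrophe and the rest is treated as a new word. Matching [[:graph:]]* instead fixes it, as the second command shows.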

echo "nor'th we'ST, 0" | sed -e 's/.*/\L&/' -e 's/[a-z]*/\u&/g'
Nor'Th We'St, 0

echo "nor'th we'ST, 0" | sed -e 's/.*/\L&/' -e 's/[[:graph:]]*/\u&/g'
Nor'th We'st, 0

/[[:graph:]]/ – Non-blank character (excludes spaces, control characters, and similar)
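
The two expressions can be wrapped in a small function for reuse. This is only a sketch: the name title_case is made up here, and \L and \u require GNU sed.

```shell
# title_case: hypothetical helper; reads lines on stdin (GNU sed needed for \L and \u).
# First lower-case the whole line, then upper-case the first character of every
# run of non-blank characters.
title_case() {
    sed -e 's/.*/\L&/' -e 's/[[:graph:]]*/\u&/g'
}

echo "nor'th we'ST, 0" | title_case
echo "noRTh weST, 0" | title_case
```

Because a "word" here is any run of non-blank characters, punctuation attached to a word (such as the comma after "west") stays part of that word.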

Source:
http://bashscripts.org/forum/viewtopic.php?f=21&t=889
http://www.ruby-doc.org/core-1.9.3/Regexp.html

Categories: sed

sed – Replace only certain occurrences

November 8, 2013

Problem:
I needed to change only certain occurrences of a match on a line.

Solution:
Best explained with an example.

To change the first one.

$ echo "little kittens on a little boat" | sed 's/little/big/'
big kittens on a little boat

To change the second one only.

$ echo "little kittens on a little boat" | sed 's/little/big/2'
little kittens on a big boat

To change the second one and every one after it (combining a number with g is a GNU sed extension).

$ echo "little kittens on a little boat eating little fish" | sed 's/little/big/2g'
little kittens on a big boat eating big fish
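
The occurrence number can also come from a variable. A minimal sketch, assuming GNU sed and a helper name (replace_from) invented for this example:

```shell
# replace_from N OLD NEW: hypothetical helper; replaces the Nth and all later
# matches of OLD on each line. The number+g combination is a GNU sed extension.
# OLD and NEW are pasted into the sed script, so they must not contain "/".
replace_from() {
    sed "s/$2/$3/${1}g"
}

echo "little kittens on a little boat eating little fish" | replace_from 3 little big
```

Here only the third "little" is changed, because it is the last match on the line.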
Categories: sed

sed – Add 0 values to records in an existing file

May 7, 2013

Just noting this here for future reference.

Problem:
The full problem is stated in the thread linked under Source below. This is just a simplified version.

Input:

N/A124    14 0.8    1051670971100000
N/A125    15 0.8    1051670971100001
N/A126    16 0.8    1051670971100002
N/A127    17 0.8    1051670971100003

Output:

N/A124000014 0.8    1051670971100000
N/A125000015 0.8    1051670971100001
N/A126000016 0.8    1051670971100002
N/A127000017 0.8    1051670971100003

Solution:
There were other solutions but this one with sed caught my attention.

$ echo "N/A124    14 0.8    1051670971100000
N/A125    15 0.8    1051670971100001
N/A126    16 0.8    1051670971100002
N/A127    17 0.8    1051670971100003" | \
sed ":a; s/^\(.\{6,11\}\) /\10/; ta;"
N/A124000014 0.8    1051670971100000
N/A125000015 0.8    1051670971100001
N/A126000016 0.8    1051670971100002
N/A127000017 0.8    1051670971100003

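This is what I think sed is doing. (Please correct me if I’m wrong)
:a – Create a label named a.
s/^\(.\{6,11\}\) /\10/ – Capture the first 6 to 11 characters that are followed by a space, then replace the match (those characters plus the space) with the captured text and a 0. Each pass turns one space into a 0.
ta – If the substitution was successful, branch back to label a and try again. The loop ends when no space is left within the first 12 characters.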

t label
If a s/// has done a successful substitution since the last input line was read and since the last t or T command, then branch to label; if label is omitted, branch to end of script.

`\{I,J\}'
Matches between I and J, inclusive, sequences.
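
To see what the :a ... ta loop does, the substitution can be run one pass at a time. This is a sketch for illustration only; the function name one_pass and the shortened sample line are assumptions, not part of the original solution.

```shell
# one_pass: hypothetical helper; a single substitution without the :a/ta loop,
# so each call performs exactly one step of the loop.
one_pass() {
    sed 's/^\(.\{6,11\}\) /\10/'
}

line='N/A124    14 0.8'
for pass in 1 2 3 4 5; do
    line=$(printf '%s\n' "$line" | one_pass)
    printf 'pass %s: %s\n' "$pass" "$line"
done
```

Passes 1 to 4 each turn the rightmost of the four padding spaces into a 0. Pass 5 changes nothing, because the next space sits after 12 characters and the pattern allows at most 11; that failed substitution is the point at which ta stops branching.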

Source:
http://www.unix.com/shell-programming-scripting/223047-add-0-values-replace-empty-value.html

Categories: bash, sed

sed – Convert to upper case

November 28, 2012

Problem:
I need to convert text to UPPER CASE using sed.

Solution:

user@computer:~$ echo "first,second,third" |\
sed 's/\([a-z]\+\),/\U\1,/'
FIRST,second,third
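
The command above upper-cases only the first field. Two variations, sketched here with made-up function names and assuming GNU sed (\U and \+ are GNU extensions), cover every comma-terminated field or the whole line:

```shell
# upper_fields: hypothetical helper; the g flag repeats the original substitution
# for every field that ends in a comma (the last field has no comma, so it is untouched).
upper_fields() {
    sed 's/\([a-z]\+\),/\U\1,/g'
}

# upper_line: hypothetical helper; upper-cases the entire line.
upper_line() {
    sed 's/.*/\U&/'
}

echo "first,second,third" | upper_fields
echo "first,second,third" | upper_line
```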

Source:
http://nixcraft.com/shell-scripting/15862-sed-convert-text-lower-upper-case.html

Categories: bash, sed

sed – exclude character in search

July 24, 2012

Problem:
Needed to match characters in a string. Due to the greedy nature of regex in sed I had to be more specific with the search and include [a-z0-9] when matching the text. Fortunately, there is a much simpler way of doing this.

Solution:
Say you have an entry like the one below. (Copied from unix.com)

$ less infile
GET /dynamic_branding_playlist.fmil?domain=915oGLbNZhb&pluginVersion=3.2.7_2.6&pubchannel=usa&sdk_ver=2.4.6.3&width=680&height=290&embeddedIn=http%3A%2F%2Fviewster.com%2Fsplash%2FOscar-Videos-1.aspx%3Futm_source%3Dadon_272024_113535_24905_24905%26utm_medium%3Dcpc%26utm_campaign%3DUSYME%26adv%3D573900%26req%3D5006e9ce1ca8b26347b88a7.1.825&sdk_url=http%3A%2F%2Fdivaag.vo.llnwd.net%2Fo42%2Fhttp_only%2Fviewster_com%2Fv25%2Fyume%2F&viewport=42

You want to extract the values of domain and sdk_ver. This should work.

$ sed 's/.*domain=\([^&]*\).*sdk_ver=\([^&]*\).*/\1 \2/' infile

What I learnt from this was the use of [^&]*, which matches any run of characters other than &, so each capture stops at the next &. The longer way would be to list the allowed characters explicitly, as in [0-9a-zA-Z]*&.
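
A shorter request line makes the behaviour easier to check by eye. The sketch below uses a made-up function name and a made-up, shortened request that keeps the same domain and sdk_ver values:

```shell
# extract_params: hypothetical helper; prints the domain and sdk_ver values.
# [^&]* stops at the next & (or at end of line), so each capture holds one value.
extract_params() {
    sed 's/.*domain=\([^&]*\).*sdk_ver=\([^&]*\).*/\1 \2/'
}

echo 'GET /x?domain=915oGLbNZhb&pubchannel=usa&sdk_ver=2.4.6.3&width=680' | extract_params
```

The leading and trailing .* swallow everything outside the two captures, so the whole line is replaced by just the two values.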

Source:
http://www.unix.com/shell-programming-scripting/194189-extract-key-words-print-their-values.html

Categories: sed

bash – Trimming spaces from variables.

March 15, 2012

Problem:
I was reading “settings” from a pipe-separated text file. The values I was getting had extra whitespace that was not needed, and I needed to strip it out. Here is a sample of the text file I was using.

cat sample_file
this | is a sample |   file with |    unwanted spaces   | 

Solution:
After a quick search on Google I found a few solutions. Here is what worked for me.

1. sed

cut -d"|" -f4 sample_file | sed 's/^ \+\| \+$//g'

2. awk

cut -d"|" -f4 sample_file | awk '{gsub(/^ +| +$/,"")}1'
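
Both commands above strip spaces only, and the sed version relies on the GNU \+ and \| extensions. A more portable sketch, with a function name (trim) chosen here for illustration, uses [[:space:]] so that tabs are stripped as well:

```shell
# trim: hypothetical helper; strips leading and trailing whitespace (spaces and tabs)
# from each line on stdin, using only POSIX sed syntax.
trim() {
    sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//'
}

printf '%s\n' 'this | is a sample |   file with |    unwanted spaces   | ' | cut -d'|' -f4 | trim
```

Whitespace inside the value (the single space between the two words) is left alone, because each expression is anchored to one end of the line.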

Source:
http://stackoverflow.com/questions/369758/how-to-trim-whitespace-from-bash-variable

Categories: awk, bash, sed