Archive for the ‘sed’ Category

Change the last line in a file

February 12, 2019

Problem:
I needed to remove a comma from the last line of a file.

Solution:
Learnt that sed accepts $ as an address for the last line, so the substitution can be limited to that line.

# File with commas
$ less minerals.csv
"copper",
"bronze",
"gold",
"platinum",
$ sed -i '$ s/,//' minerals.csv
$ less minerals.csv
"copper",
"bronze",
"gold",
"platinum"

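A minimal sketch of the same edit made in place with a backup copy, assuming GNU sed (the attached -i.bak form is also accepted by BSD sed). The scratch directory and the ,$ anchor are my additions for illustration; the anchor makes sure only a trailing comma is removed.

```shell
# Work in a scratch directory so nothing real is overwritten.
workdir=$(mktemp -d)
printf '%s\n' '"copper",' '"bronze",' '"gold",' '"platinum",' > "$workdir/minerals.csv"

# -i.bak edits the file in place and keeps the original as minerals.csv.bak.
# '$' addresses the last line; ',$' anchors the comma to the end of that line.
sed -i.bak '$ s/,$//' "$workdir/minerals.csv"

last_line=$(tail -n 1 "$workdir/minerals.csv")
printf '%s\n' "$last_line"
```

Without -i, sed only writes the changed text to standard output and leaves the file untouched.
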
Addresses

Sed commands can be given with no addresses, in which case the command will be executed for all input lines; with one address, in which case the command will only be executed for input lines which match that address; or with two addresses, in which case the command will be executed for all input lines which match the inclusive range of lines starting from the first address and continuing to the second address. Three things to note about address ranges: the syntax is addr1,addr2 (i.e., the addresses are separated by a comma); the line which addr1 matched will always be accepted, even if addr2 selects an earlier line; and if addr2 is a regexp, it will not be tested against the line that addr1 matched.

After the address (or address-range), and before the command, a ! may be inserted, which specifies that the command shall only be executed if the address (or address-range) does not match.
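A small sketch of the ! form, using made-up input. It shows the reverse of the problem above (adding commas to every line except the last) and a negated range.

```shell
# '$!' means "every line that is NOT the last line":
# append a comma to all lines except the last one.
with_commas=$(printf '%s\n' copper bronze gold | sed '$!s/$/,/')
printf '%s\n' "$with_commas"

# '2,3!d' deletes every line outside the range 2 to 3.
middle=$(printf '%s\n' one two three four | sed '2,3!d')
printf '%s\n' "$middle"
```
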

The following address types are supported:

number
Match only the specified line number.

first~step
Match every step’th line starting with line first. For example, "sed -n 1~2p" will print all the odd-numbered lines in the input stream, and the address 2~5 will match every fifth line, starting with the second. first can be zero; in this case, sed operates as if it were equal to step. (This is an extension.)

$
Match the last line.

/regexp/
Match lines matching the regular expression regexp.

\cregexpc
Match lines matching the regular expression regexp. The c may be any character.

GNU sed also supports some special 2-address forms:

0,addr2
Start out in “matched first address” state, until addr2 is found. This is similar to 1,addr2, except that if addr2 matches the very first line of input the 0,addr2 form will be at the end of its range, whereas the 1,addr2 form will still be at the beginning of its range. This works only when addr2 is a regular expression.

addr1,+N
Will match addr1 and the N lines following addr1.

addr1,~N
Will match addr1 and the lines following addr1 until the next line whose input line number is a multiple of N.
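
The address types above can be tried out on the numbers 1 to 10. This is only a sketch and assumes GNU sed, since first~step, 0,addr2, addr1,+N and addr1,~N are GNU extensions. The join_lines helper is a made-up name that puts each result on one line.

```shell
# Join output lines with spaces so each result fits on one line.
join_lines() { paste -sd ' ' -; }

single=$(seq 1 10 | sed -n '3p' | join_lines)        # line number
odd=$(seq 1 10 | sed -n '1~2p' | join_lines)         # first~step
last=$(seq 1 10 | sed -n '$p' | join_lines)          # last line
regex=$(seq 1 10 | sed -n '/^1/p' | join_lines)      # /regexp/
custom=$(seq 1 10 | sed -n '\,^1,p' | join_lines)    # \cregexpc with c = ,
plus=$(seq 1 10 | sed -n '2,+2p' | join_lines)       # addr1,+N
multiple=$(seq 1 10 | sed -n '2,~4p' | join_lines)   # addr1,~N
zero=$(seq 1 10 | sed -n '0,/1/p' | join_lines)      # range ends on line 1
one=$(seq 1 10 | sed -n '1,/1/p' | join_lines)       # range runs on to "10"

printf '%s\n' "$single" "$odd" "$last" "$regex" "$custom" \
  "$plus" "$multiple" "$zero" "$one"
```

The last two lines show the difference between 0,addr2 and 1,addr2: the first line of input is "1", so 0,/1/ stops there, while 1,/1/ does not test line 1 against the regexp and continues until "10".
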

Source:
https://stackoverflow.com/questions/3576139/sed-remove-string-only-in-the-last-line-of-the-file
https://linux.die.net/man/1/sed

Categories: bash, sed

Delete specific line numbers in a file – awk, sed

March 14, 2014

Problem:
You want to delete every third line in a file: lines 3, 6, 9, 12 and so on.

$ less file
first
second
third
fourth
fifth
sixth
seventh
eigth
nineth
tenth

Solution:
1. Using awk.

$ awk 'NR%3' file
first
second
fourth
fifth
seventh
eigth
tenth

My understanding of how it works.

$ man awk

NR The total number of input records seen so far.

$ awk 'NR' file 

This prints every row: NR is always non-zero, which awk treats as true, and a true pattern with no action prints the line.
To print only row x, compare NR with x:

$ awk 'NR == 2' file
second
$ awk 'NR == 5' file
fifth

The example below prints nothing. The pattern 0 is always false, so the default print action never runs.

$ awk '0' file

So with NR%3, every line whose result is not 0 gets printed.
1%3 is 1, so line 1 is printed.
2%3 is 2, so line 2 is printed.
3%3 is 0, so line 3 is not printed.
4%3 is 1, so line 4 is printed.
And so on.
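
A short sketch of the same idea on the numbers 1 to 10. The inverse test and the variable n passed with -v are my additions for illustration.

```shell
# NR % 3 is non-zero (true) for every line that is not a multiple of 3.
kept=$(seq 1 10 | awk 'NR % 3' | paste -sd ' ' -)

# The inverse: print only every third line.
every_third=$(seq 1 10 | awk 'NR % 3 == 0' | paste -sd ' ' -)

# Pass the step in as a variable instead of hard-coding it.
kept_n=$(seq 1 10 | awk -v n=4 'NR % n' | paste -sd ' ' -)

printf '%s\n' "$kept" "$every_third" "$kept_n"
```
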

2. Using sed.

$ sed 'n;n;d' file
first
second
fourth
fifth
seventh
eigth
tenth

How I think it works.

$ sed 'n;n;d' file

n – move to the next line
n – move to the next line
d – delete that line.

I think that because d ends the current cycle and immediately starts the next one, the script runs again from the top on the following line, and this repeats until EOF.

$ info sed

`n’
If auto-print is not disabled, print the pattern space, then,
regardless, replace the pattern space with the next line of input.
If there is no more input then `sed’ exits without processing any
more commands.

`d’
Delete the pattern space; immediately start next cycle.
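
As a cross-check, here is a sketch comparing n;n;d with the GNU-only first~step address on the numbers 1 to 10. It assumes GNU sed; 0~3 selects lines 3, 6, 9 and so on.

```shell
# Print two lines, delete the third, then start the next cycle.
classic=$(seq 1 10 | sed 'n;n;d' | paste -sd ' ' -)

# GNU extension: delete every line whose number is a multiple of 3.
gnu_step=$(seq 1 10 | sed '0~3d' | paste -sd ' ' -)

printf '%s\n' "$classic" "$gnu_step"
```
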

NB: File must not contain blank lines.

Source:
http://www.unix.com/shell-programming-scripting/245088-how-delete-line-number-3-6-9-12-15-so.html

Categories: awk, Interesting, sed

sed – Convert to Title Case

December 2, 2013

Problem:
I needed to ensure that text read from a file was in Title Case. Solution had to be sed 🙂

Solution:
1. This is what I came up with. It only fixes the first word, so I googled and found simpler solutions that handle every word.

echo "noRTh weST, 0" |sed 's/^\([A-Za-z]\)\([A-Za-z]\+\)/\U\1\L\2/'
North weST, 0

2. A simpler solution

echo "noRTh weST, 0" | sed -e 's/.*/\L&/' -e 's/[a-z]*/\u&/g'
North West, 0

sed -e 's/.*/\L&/' - Convert all the characters to lowercase.
sed -e 's/[a-z]*/\u&/g' - Convert only the first character of each word to uppercase.

\L – Turn the replacement to lowercase until a `\U’ or `\E’ is found.
& – References the whole matched portion of the pattern space.
\u – Turn the next character to uppercase.

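3. The above solution fails when a word contains a non-letter such as an apostrophe, because [a-z]* stops at it. Matching [[:graph:]]* instead fixes this.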

echo "nor'th we'ST, 0" | sed -e 's/.*/\L&/' -e 's/[a-z]*/\u&/g'
Nor'Th We'St, 0

echo "nor'th we'ST, 0" | sed -e 's/.*/\L&/' -e 's/[[:graph:]]*/\u&/g'
Nor'th We'st, 0

/[[:graph:]]/ – Non-blank character (excludes spaces, control characters, and similar)
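
A sketch that wraps the final version in a shell function. The name title_case is made up for illustration, and \L and \u need GNU sed.

```shell
# Hypothetical helper: lowercase the whole line, then uppercase the first
# character of every run of non-blank characters. Requires GNU sed.
title_case() {
  sed -e 's/.*/\L&/' -e 's/[[:graph:]]*/\u&/g'
}

apostrophes=$(echo "nor'th we'ST, 0" | title_case)
plain=$(echo "hELLO wORLD" | title_case)
printf '%s\n' "$apostrophes" "$plain"
```
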

Source:
http://bashscripts.org/forum/viewtopic.php?f=21&t=889
http://www.ruby-doc.org/core-1.9.3/Regexp.html

Categories: sed

sed – Replace only certain occurrences

November 8, 2013

Problem:
I needed to change only certain occurrences.

Solution:
Best explained with an example.

To change the first one.

$ echo "little kittens on a little boat" | sed 's/little/big/'
big kittens on a little boat

To change the second one only.

$ echo "little kittens on a little boat" | sed 's/little/big/2'
little kittens on a big boat

To change the second one and every one after it. (Combining a number with g behaves this way in GNU sed; POSIX does not define the combination.)

$ echo "little kittens on a little boat eating little fish" | sed 's/little/big/2g'
little kittens on a big boat eating big fish
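
Two more cases, as a sketch with made-up input: a different occurrence number, and the fact that the count restarts on every line because sed applies the command per line.

```shell
text="little kittens on a little boat eating little fish"

# Change only the third occurrence.
third_only=$(echo "$text" | sed 's/little/big/3')
printf '%s\n' "$third_only"

# The number counts matches within each line, not across the whole input.
per_line=$(printf '%s\n' "little little" "little little" | sed 's/little/big/2')
printf '%s\n' "$per_line"
```
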
Categories: sed

sed – Add 0 values to records in an existing file

May 7, 2013

Just noting this here for future reference.

Problem:
The full problem is stated here. This is just a simplified version.

Input:

N/A124    14 0.8    1051670971100000
N/A125    15 0.8    1051670971100001
N/A126    16 0.8    1051670971100002
N/A127    17 0.8    1051670971100003

Output:

N/A124000014 0.8    1051670971100000
N/A125000015 0.8    1051670971100001
N/A126000016 0.8    1051670971100002
N/A127000017 0.8    1051670971100003

Solution:
There were other solutions but this one with sed caught my attention.

$ echo "N/A124    14 0.8    1051670971100000
N/A125    15 0.8    1051670971100001
N/A126    16 0.8    1051670971100002
N/A127    17 0.8    1051670971100003" | \
sed ":a; s/^\(.\{6,11\}\) /\10/; ta;"
N/A124000014 0.8    1051670971100000
N/A125000015 0.8    1051670971100001
N/A126000016 0.8    1051670971100002
N/A127000017 0.8    1051670971100003

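This is what I think sed is doing. (Please correct me if I’m wrong)
:a – Create a label named a.
s/^\(.\{6,11\}\) /\10/ – Match the first 6 to 11 characters of the line when they are followed by a space, and replace the match with the captured characters followed by a 0 (\10 is \1 followed by a literal 0). Each successful substitution turns one space into a zero.
ta – If the substitution was successful, branch back to label a and try again. The loop ends when no space is left in columns 7 to 12.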

t label
If a s/// has done a successful substitution since the last input line was read and since the last t or T command, then branch to label; if label is omitted, branch to end of script.

`\{I,J\}’
Matches between I and J, inclusive, sequences.
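
To see what each pass of the loop does, here is a sketch that runs the substitution once per iteration from a shell loop instead of using :a and ta. The shortened input line and the variable names are mine.

```shell
line='N/A124    14 0.8'
steps=0
while :; do
  # One pass of the substitution, without the :a ... ta loop.
  next=$(printf '%s\n' "$line" | sed 's/^\(.\{6,11\}\) /\10/')
  # No change means the substitution failed, which is where ta stops looping.
  [ "$next" = "$line" ] && break
  line=$next
  steps=$((steps + 1))
  printf 'pass %d: %s\n' "$steps" "$line"
done
```

Because .\{6,11\} is greedy, the rightmost space in columns 7 to 12 is replaced first, so the zeros fill in from right to left.
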

Source:
http://www.unix.com/shell-programming-scripting/223047-add-0-values-replace-empty-value.html

Categories: bash, sed

sed – Convert to upper case

November 28, 2012

Problem:
I need to convert text to UPPER CASE using sed.

Solution:

user@computer:~$ echo "first,second,third" |\
sed 's/\([a-z]\+\),/\U\1,/'
FIRST,second,third
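
A few variations as a sketch, assuming GNU sed for \U. The tr line is a portable alternative for when the whole input should be upper case.

```shell
# Upper-case the whole line.
all_upper=$(echo "first,second,third" | sed 's/.*/\U&/')

# Upper-case only the second comma-separated field.
second_upper=$(echo "first,second,third" | sed 's/^\([^,]*,\)\([^,]*\)/\1\U\2/')

# Portable alternative without sed.
tr_upper=$(echo "first,second,third" | tr '[:lower:]' '[:upper:]')

printf '%s\n' "$all_upper" "$second_upper" "$tr_upper"
```
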

Source:
http://nixcraft.com/shell-scripting/15862-sed-convert-text-lower-upper-case.html

Categories: bash, sed

sed – exclude character in search

July 24, 2012

Problem:
Needed to match characters in a string. Due to the greedy nature of regex in sed I had to be more specific with the search and include [a-z0-9] when matching the text. Fortunately, there is a much simpler way of doing this.

Solution:
Say you have an entry like the one below. (Copied from unix.com)

$ less infile
GET /dynamic_branding_playlist.fmil?domain=915oGLbNZhb&pluginVersion=3.2.7_2.6&pubchannel=usa&sdk_ver=2.4.6.3&width=680&height=290&embeddedIn=http%3A%2F%2Fviewster.com%2Fsplash%2FOscar-Videos-1.aspx%3Futm_source%3Dadon_272024_113535_24905_24905%26utm_medium%3Dcpc%26utm_campaign%3DUSYME%26adv %3D573900%26req%3D5006e9ce1ca8b26347b88a7.1.825&sdk_url=http%3A%2F%2Fdivaag.vo.llnwd.net%2Fo42%2Fhtt p_only%2Fviewster_com%2Fv25%2Fyume%2F&viewport=42

You want to extract the values of domain and sdk_ver. This should work.

$ sed 's/.*domain=\([^&]*\).*sdk_ver=\([^&]*\).*/\1 \2/' infile

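What I learnt from this was the use of [^&]*, which matches any run of characters other than &, so the capture stops at the next & (or at the end of the line). The longer way would be to list the allowed characters explicitly, e.g. [0-9a-zA-Z]*&, which breaks as soon as a value contains a character that is not listed, such as the dots in 2.4.6.3.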
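
A sketch of the difference on shorter, made-up query strings (the second one is a cut-down version of the entry above), comparing a greedy .* capture with [^&]*.

```shell
query='a=1&b=2&c=3&d=4'

# Greedy capture: \(.*\) runs on to the LAST & in the line.
greedy=$(printf '%s\n' "$query" | sed 's/.*b=\(.*\)&.*/\1/')

# Negated class: [^&]* stops at the first & after b=.
exact=$(printf '%s\n' "$query" | sed 's/.*b=\([^&]*\).*/\1/')

# The command from the post on a shortened request line.
short='domain=915oGLbNZhb&pluginVersion=3.2.7_2.6&sdk_ver=2.4.6.3&width=680'
pair=$(printf '%s\n' "$short" | sed 's/.*domain=\([^&]*\).*sdk_ver=\([^&]*\).*/\1 \2/')

printf '%s\n' "$greedy" "$exact" "$pair"
```
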

Source:
http://www.unix.com/shell-programming-scripting/194189-extract-key-words-print-their-values.html

Categories: sed