An Awk Primer/Using Awk from the Command Line

The Awk programming language was designed to be simple but powerful. It allows a user to perform relatively sophisticated text-manipulation operations through Awk programs written on the command line.

For example, suppose I want to turn a single-spaced document into a double-spaced document. I could easily do that with the following Awk program:

awk '{print ; print ""}' infile > outfile

Notice how single-quotes (' ') are used to allow using double-quotes (" ") within the Awk expression. This "hides" special characters from the shell. We could also do this as follows:

awk "{print ; print \"\"}" infile > outfile

but the single-quote method is simpler.
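As a quick sanity check, the one-liner can be tried on a tiny sample (the file names "infile" and "outfile" here are just placeholders):

```shell
# Create a small single-spaced sample file
printf 'one\ntwo\n' > infile

# Double-space it: print each input line, then an empty line
awk '{print ; print ""}' infile > outfile

# outfile now contains: one, blank, two, blank
cat outfile
```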

This program does what it's supposed to, but it also doubles every blank line in the input file, which leaves a lot of empty space in the output. That's easy to fix: just tell Awk to print the extra blank line only if the current line is not blank:

awk '{print ; if (NF != 0) print ""}' infile > outfile

 * One of the problems with Awk is that it is ingenious enough to make a user want to tinker with it, and use it for tasks for which it isn't really appropriate. For example, we could use Awk to count the number of lines in a file:

awk 'END {print NR}' infile

but this is dumb, because the "wc" (word count) utility gives the same answer with less bother: "Use the right tool for the job."

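A quick check of the blank-line-aware double-spacer above (again with placeholder file names) shows that an existing blank line passes through just once:

```shell
# Sample input containing a blank line between "one" and "two"
printf 'one\n\ntwo\n' > infile

# Print every line; add an extra blank line only after non-blank lines
awk '{print ; if (NF != 0) print ""}' infile > outfile

# outfile now contains: one, blank, blank, two, blank
cat outfile
```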
Awk is the right tool for slightly more complicated tasks. Once I had a file containing an email distribution list. The email addresses of the various groups were placed on consecutive lines in the file, with the groups separated by blank lines. If I wanted to quickly and reliably determine how many people were on the distribution list, I couldn't use "wc", since it counts blank lines, but Awk handled it easily:

awk 'NF != 0 {++count} END {print count}' list
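For instance, on a small mock-up of such a list (the addresses below are made up), the count comes out right:

```shell
# Two groups of addresses separated by a blank line (sample data)
printf 'alice@example.com\nbob@example.com\n\ncarol@example.com\n' > list

# Count only the non-blank lines; prints 3
awk 'NF != 0 {++count} END {print count}' list
```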
 * Another problem I ran into was determining the average size of a number of files. I was creating a set of bitmaps with a scanner and storing them on a disk.  The disk started getting full and I was curious to know just how many more bitmaps I could store on the disk.

I could obtain the file sizes in bytes using "wc -c" or the "list" utility ("ls -l" or "ll"). A few tests showed that "ll" was faster. Since "ll" lists the file size in the fifth field, all I had to do was sum up the fifth field and divide by NR. There was one slight problem, however: the first line of the output of "ll" listed the total number of sectors used, and had to be skipped.

No problem. I simply entered:

ll | awk 'NR!=1 {s+=$5} END {print "Average: " s/(NR-1)}'

This gave me the average as about 40 KB per file.
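The same pipeline can be reproduced with "ls -l" (of which "ll" is commonly an alias) on files of known size; the scratch directory name and the sizes here are arbitrary:

```shell
# Create two files of known size (10 and 30 bytes) in a scratch directory
mkdir -p sizedemo
head -c 10 /dev/zero > sizedemo/a
head -c 30 /dev/zero > sizedemo/b

# Skip the leading "total" line, sum field 5 (the size in bytes),
# and divide by the number of files
ls -l sizedemo | awk 'NR!=1 {s+=$5} END {print "Average: " s/(NR-1)}'
# prints: Average: 20
```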


 * Awk is useful for performing simple iterative computations for which a more sophisticated language like C might prove overkill. Consider the Fibonacci sequence:

1 1 2 3 5 8 13 21 34 ...

Each element in the sequence is constructed by adding the two previous elements together, with the first two elements defined as both "1". It's a discrete formula for exponential growth. It is very easy to use Awk to generate this sequence:

awk 'BEGIN {a=1;b=0; while(++x<=10){print a; t=a;a=a+b;b=t}; exit}'

This generates the following output, one number per line:

1   1   2   3   5   8   13   21   34   55