AWK cheat sheet
This cheat sheet is intended for beginners and regular users of AWK. It starts with the basics, common variables, and operators. The middle and end contain many examples that show how AWK can be used.
Basic usage
Most one-liners or scripts written in AWK consist of an optional expression (the condition) and at least one statement (the action). A typical example: if the value in the first field is 10, then show fields 2 and 3.
Format | Intended action |
---|---|
{statements} | Perform the statement or statements defined within the braces |
if (expression) {statements} | Perform the statements if the expression is true |
if (expression) {statements1} else {statements2} | Execute statements1 group if true, statements2 otherwise |
If the field separator is not whitespace (spaces or tabs), define it with -F followed by the separator.
awk -F: '{ if($1=="root") {print} }' /etc/passwd
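The "first field is 10" case described above can be sketched with inline data. The sample rows are made up for illustration. The block shows the explicit if form next to the shorter pattern { action } form, which should behave the same.

```shell
# Hypothetical sample input: three whitespace-separated fields per line.
data='10 apple red
20 pear green
10 plum blue'

# Explicit if-statement inside the action block.
with_if=$(printf '%s\n' "$data" | awk '{ if ($1 == 10) { print $2, $3 } }')

# Same logic using the shorter pattern { action } form.
with_pattern=$(printf '%s\n' "$data" | awk '$1 == 10 { print $2, $3 }')

echo "$with_if"
echo "$with_pattern"
```

Both commands print the second and third field of the two lines that start with 10.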
Variables
Within AWK a few variables are used that have a special meaning. They can be used to perform comparisons or to print a result.
Variable name | Usage |
---|---|
$0 | Full line |
$1, $2, $3 … $NF | First, second, third, and last field |
NF | Number of Fields |
NR | Number of Records read so far (the current line number) |
OFS | Output Field Separator (default: " ") |
FS | Field Separator for input (default: " ") |
ORS | Output Record Separator (default: "\n") |
RS | Record Separator for input (default: "\n") |
FILENAME | The name of the file |
Example showing the number of fields for each line of the passwd file:
awk -F: '{print NF}' /etc/passwd
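As a small sketch of how several of these variables work together, the following uses made-up, colon-separated input instead of a real file. The sample data and the " | " output separator are assumptions chosen only to make the effect visible.

```shell
# Hypothetical sample input with a colon as the field separator.
data='alice:1001:/home/alice
bob:1002:/home/bob'

# NR is the current record number, NF the number of fields, $NF the last field.
# OFS is placed between the comma-separated arguments of print.
out=$(printf '%s\n' "$data" | awk -F: 'BEGIN { OFS = " | " } { print NR, NF, $1, $NF }')

echo "$out"
```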
Operators
Within AWK it is common to use an operator to compare two values. For example, to check whether value1 is greater than value2.
Operator | Meaning |
---|---|
< | Less than |
<= | Less than, or equal to |
>= | Greater than or equal to |
> | Greater than |
== | Equal to |
!= | Not equal to |
~ | Matches a regular expression (comparing strings) |
!~ | Does not match a regular expression (comparing strings) |
&& | Boolean operator (AND) |
|| | Boolean operator (OR) |
Operators are typically used within an if-statement and decide if a statement needs to be executed.
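A minimal sketch of the comparison and match operators, using invented input (host name, status code, size). Here the expressions are used as patterns in front of the action, which is equivalent to wrapping them in an if-statement.

```shell
# Hypothetical sample input: host name, status code, size.
data='web01 200 512
web02 404 128
db01 200 2048'

# Numeric comparison combined (AND) with a regular expression match on field 1.
ok_web=$(printf '%s\n' "$data" | awk '$2 == 200 && $1 ~ /^web/ { print $1 }')

# Negated match: hosts whose name does not start with "web".
others=$(printf '%s\n' "$data" | awk '$1 !~ /^web/ { print $1 }')

echo "$ok_web"
echo "$others"
```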
There are also mathematical operators:
Arithmetic operator | Meaning |
---|---|
x + y | Addition (2+1=3) |
x - y | Subtraction (5-2=3) |
x * y | Multiplication (2*3=6) |
x % y | Remainder (5%2=1) |
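The arithmetic operators can be applied directly to fields. The sketch below assumes made-up input with an item name, a quantity, and a unit price. It multiplies the two numbers and uses the remainder operator to check whether the result is odd.

```shell
# Hypothetical sample input: item name, quantity, unit price.
data='pen 3 2
book 2 7
bag 1 5'

# total is quantity times price; total % 2 is 1 for odd totals and 0 for even ones.
out=$(printf '%s\n' "$data" | awk '{ total = $2 * $3; print $1, total, total % 2 }')

echo "$out"
```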
BEGIN and END
Sometimes we want to take an action before the first line is processed. The BEGIN statement makes this possible. Conversely, END performs an action after everything has been processed. It is useful to summarize information or transform the outcome.
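A short sketch of BEGIN and END around a normal per-line action. The input values are invented for illustration. BEGIN prints a header, the middle block adds up the second field, and END prints the sum.

```shell
# Hypothetical sample input: item name and cost.
data='pen 6
book 14
bag 5'

out=$(printf '%s\n' "$data" | awk '
BEGIN { print "item cost" }          # runs once, before any input is read
      { sum += $2; print $1, $2 }    # runs for every input line
END   { print "total", sum }         # runs once, after the last line
')

echo "$out"
```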
Frequently used snippets
Snippet | Intended goal | Example snippets |
---|---|---|
BEGIN | Perform action before any input is processed | Parse /etc/passwd file |
NR>1 | Only show the lines after line x (1 in this case) | Parse output of ss command |
NR==2 | Only show the second line | |
END{print NR} | Print the number of records (wc -l) | |
NR%2==0 | Show only the even lines | |
$1=="a" && $2=="b" | Match only if both expressions are true | |
{a[$2]++}END{for(n in a)print n, a[n]} | Count items based on value in field 2, then show number of lines with that value | |
Note: we use short notation here for display. NR%2==0 is probably better written as NR % 2 == 0 in your one-liners.
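To see what the NR-based snippets select, here is a sketch that runs them against four invented lines. A pattern without an action prints the matching line.

```shell
# Hypothetical four-line input.
data='header
one
two
three'

skip_header=$(printf '%s\n' "$data" | awk 'NR > 1')       # everything after line 1
second=$(printf '%s\n' "$data" | awk 'NR == 2')           # only line 2
total=$(printf '%s\n' "$data" | awk 'END { print NR }')   # number of lines
even=$(printf '%s\n' "$data" | awk 'NR % 2 == 0')         # lines 2 and 4

echo "$skip_header"
echo "$second"
echo "$total"
echo "$even"
```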
Showing output
Usually we want to display the output, which can be done using print or printf. The first simply shows the output, while the second can also apply formatting. For example, it can place a text string in a column of a specific width. It can even limit the number of decimals of a floating-point number.
Formatting output
The printf function can be used to format a floating number and limit the number of decimals.
echo 8765.4321 | awk '{printf("%.2f\n",$1)}'
The data ($1) comes in via echo and, using printf, the number of decimals is reduced to two. The output will be 8765.43.
echo "score=8765.4321" | awk -F= '{printf("%-16s %.1f\n",$1,$2)}'
Data comes in via echo. It needs to be split using the field separator option. We reserve 16 characters for the first field (a text string), then format the number and display it with just one decimal.
Output:
score            8765.4
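Column widths can be combined with a fixed number of decimals. The following sketch uses invented input. The vertical bars are only there to make the padding visible: %-8s left-aligns the text in 8 characters and %8.2f right-aligns the number in 8 characters with two decimals.

```shell
# Hypothetical sample input: a label and a number.
out=$(printf 'cpu 3.14159\nmemory 42\n' | awk '{ printf("%-8s|%8.2f|\n", $1, $2) }')

echo "$out"
```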
Remove data from some columns
Sometimes you may want to show full lines, except for one or more columns. This can be done by emptying the column value. For example, to clear out the first two columns, we set both $1 and $2 to an empty string. Assigning to a field rebuilds the line using OFS, so we set OFS to FS to keep the output comma-separated.
awk -F, 'BEGIN {OFS=FS} {$1=$2=""; print $0}' myfile.csv
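When a field is assigned, AWK rebuilds $0 and joins the fields with OFS. The sketch below, using a made-up CSV row, compares the default OFS (a space) with OFS set to the input separator.

```shell
# Hypothetical CSV row.
row='id,name,city,country'

# Default OFS: the rebuilt line is joined with spaces.
default_ofs=$(echo "$row" | awk -F, '{ $1 = $2 = ""; print $0 }')

# OFS set to FS: the commas are kept, the first two columns are empty.
comma_ofs=$(echo "$row" | awk -F, 'BEGIN { OFS = FS } { $1 = $2 = ""; print $0 }')

echo "[$default_ofs]"
echo "[$comma_ofs]"
```

The emptied columns remain in place as empty values. They are not removed from the line.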
Counting results
By using a counter stored in an array, we can easily count how often each unique value occurs in a file.
awk '{count[$1]++}; END { for (i in count) print i, count[i] }' /var/log/nginx/access.log
$1 is the IP address in a default nginx access log.
To count occurrences only for lines that match a specific pattern, we add an if in front of the counter. At the end we use a for loop to display the results.
awk '{if ($9~"NextCloud-News/1.0") { a[$3]++ }} END { for (n in a) print n, a[n] }' file.log
This one-liner searches for the user agent in field 9. For every match, it will increase the counter based on field 3. When we are done with processing, we loop through the results after the END.
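The same counting technique can be tried on a few invented log lines. The layout (time, level, user, message) is an assumption for this sketch and not a real log format. Because the order of a for (n in a) loop is not defined, the result is piped through sort to get stable output.

```shell
# Hypothetical log lines: time, level, user, message.
log='10:01 INFO alice login
10:02 ERROR bob timeout
10:03 ERROR alice timeout
10:04 ERROR bob denied'

# Count ERROR lines per user (field 3), then print each user with its count.
out=$(printf '%s\n' "$log" |
  awk '{ if ($2 ~ "ERROR") { a[$3]++ } } END { for (n in a) print n, a[n] }' |
  sort)

echo "$out"
```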
Search a specific pattern
If the first field is equal to pattern, then show the full line:
awk '($1 == "pattern") {print $0}' filename
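If the first field starts with pattern1 or pattern2, then show the second field. AWK uses extended regular expressions, so the alternation is written as a plain | without a backslash:
awk '($1 ~ /^(pattern1|pattern2)/) {print $2}' filename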
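A brief sketch of the difference between an exact comparison and an anchored regular expression with alternation, using invented input. Note that the regular expression also matches "errors", because it only requires the field to start with one of the words.

```shell
# Hypothetical sample input.
data='error disk full
warning cpu hot
info all fine
errors many'

# Exact string comparison on the first field.
exact=$(printf '%s\n' "$data" | awk '$1 == "error" { print $0 }')

# First field starts with "error" or "warning" (unescaped | for alternation).
either=$(printf '%s\n' "$data" | awk '$1 ~ /^(error|warning)/ { print $2 }')

echo "$exact"
echo "$either"
```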
Using environment variables
Show the username stored in USER. See the export command for other environment variables that may be available.
awk 'BEGIN { print ENVIRON["USER"] }'
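ENVIRON only sees exported variables. As a sketch, the block below exports a hypothetical variable named GREETING and reads it from AWK. It also shows the -v option, which is another common way to pass a value into an AWK program.

```shell
# GREETING is a hypothetical variable used only for this illustration.
export GREETING="hello"

# Read the exported variable through the ENVIRON array.
from_env=$(awk 'BEGIN { print ENVIRON["GREETING"] }')

# -v assigns an awk variable (here: name) before the program starts.
from_v=$(awk -v name="world" 'BEGIN { print ENVIRON["GREETING"], name }')

echo "$from_env"
echo "$from_v"
```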
AWK examples
In this section we collect examples using variables, operators, and expressions as listed above.
Parse /etc/passwd file
The passwd file is well-structured and its data holds few surprises. There are a lot of things we can do with it, so time for some examples.
Let’s start with showing its content and add a line number in front of it:
awk -F: '{printf "%2s %s\n", NR, $0}' /etc/passwd
Or we could show the username and the user ID, separated by a ‘=’, possibly for further processing:
awk -F: '{print $1 "=" $3}' /etc/passwd
If we want to search for a particular user account, we can do that as well. Note that /root/ matches anywhere in the line, not only in the username field:
awk -F: '/root/ {print $3}' /etc/passwd
Want to return some formatted output and include a nice header? Sure, AWK can do that!
awk -F: 'BEGIN {
printf "%-20s %s\n", "Username", "Home directory"
printf "%-20s %s\n", "--------","--------------"}
{ printf "%-20s %s\n", $1, $(NF-1) }
' /etc/passwd
Output:
Username             Home directory
--------             --------------
root                 /root
daemon               /usr/sbin
bin                  /bin
How does it work? The BEGIN block with its two printf lines shows a header. The first column is formatted by reserving 20 characters of space. Finally, the last printf fills two strings (%s) for every line. The first string contains the first field ($1) and represents the username. It is also 20 characters wide, so that longer usernames can fit. The second string is filled with the second-to-last field, $(NF-1), which holds the home directory.
We can also transform fields before displaying them. For example, if a user has the /usr/sbin/nologin shell, we can alter the text.
awk -F: 'BEGIN {OFS = FS}{if($7=="/usr/sbin/nologin") $7="Thou Shalt Not Pass!"; print}' /etc/passwd
To keep the colon as the delimiter in the output, like it normally is in this file, we set the Output Field Separator (OFS) to the value of the Field Separator (FS).
Parse output of ss command
By default, ss shows an output that is easy to read, but not easy to parse.
Example: we want to know which TCP ports are in a listening state and which UDP ports are open. Let’s use the ss -lunt output.
ss -lunt | awk 'NR>1{i=split($5,a,":");print a[i]}'
How does it work?
- NR>1: skip the first line
- i=split($5,a,":"): split our input in field 5 (delimiter is colon)
- print a[i]: print the last field from each split operation
Output:
53
111
111
53
22
111
22
111
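The reason for printing a[i] instead of a[2] becomes clearer with a few address strings. The values below are invented examples in the style of the fifth column. split returns the number of pieces, and an IPv6 address contains extra colons, so the port is not always the second piece.

```shell
# Hypothetical local addresses, one per line.
addrs='0.0.0.0:22
127.0.0.53%lo:53
[::]:111'

# Print the number of pieces and the last piece (the port) for each address.
out=$(printf '%s\n' "$addrs" | awk '{ i = split($1, a, ":"); print i, a[i] }')

echo "$out"
```

The last line yields four pieces, while the port is still found in the final one.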
Now with numeric sorting, showing each value only once (unique):
ss -lunt | awk 'NR>1{i=split($5,a,":");print a[i]}' | sort -n -u
New output:
22
53
111