Awk Command Examples - Process Text Files with Awk

Explore practical Awk command examples for processing text files, pattern scanning, and data manipulation. Learn to sum integers, filter lines, and more with Awk.

Awk is a powerful text-processing utility that scans files line by line and performs actions based on specified patterns. It's widely used in Unix-like operating systems for data extraction, reporting, and manipulation.
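
Every Awk program is a series of pattern-action pairs of the form 'pattern { action }': for each input line that matches the pattern, the associated action runs. As a minimal illustration (the regex and file name below are placeholders), the following prints the line number and contents of every line containing "error":

# Print the line number and contents of every matching line.
awk '/error/ {print NR, $0}' file.txt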

Sum Integers and Process Fields

Learn how to sum integers from a file or standard input, and how to process fields using different delimiters.

# Sum integers from a file or STDIN, one integer per line.
printf '1\n2\n3\n' | awk '{sum += $1} END {print sum}'

# Use a specific character as the field separator when summing integers from a file or STDIN.
printf '1:2:3\n' | awk -F ":" '{print $1+$2+$3}'
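
# The same idea extends to other delimiters; for instance, sum the second
# column of a comma-separated file (data.csv here is a hypothetical file).
awk -F ',' '{sum += $2} END {print sum}' data.csv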

Filtering Lines with Patterns

Discover how to select specific lines from a file based on line numbers, regular expressions, or field values.

# Print line number 12 of file.txt
awk 'NR==12' file.txt

# Print first field for each record in file.txt
awk '{print $1}' file.txt

# Print only lines that match regex in file.txt
awk '/regex/' file.txt

# Print only lines that do not match regex in file.txt
awk '!/regex/' file.txt

# Print any line where field 2 is equal to "foo" in file.txt
awk '$2 == "foo"' file.txt

# Print lines where field 2 is NOT equal to "foo" in file.txt
awk '$2 != "foo"' file.txt

# Print line if field 1 matches regex in file.txt
awk '$1 ~ /regex/' file.txt

# Print line if field 1 does NOT match regex in file.txt
awk '$1 !~ /regex/' file.txt

# Print lines with more than 80 characters in file.txt
awk 'length > 80' file.txt
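
# Patterns and actions can also be combined; for instance, print fields 1 and
# 3 of every line whose second field equals "foo" (field numbers illustrative).
awk '$2 == "foo" {print $1, $3}' file.txt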

Advanced Awk Techniques

Explore more advanced Awk features such as custom output separators, paragraph processing, and accessing environment variables.

# Print a multiplication table.
awk -v RS='' '
    {
        for(i=1;i<=NF;i++){
            printf("%dx%d=%d%s", i, NR, i*NR, i==NR?"\n":"\t")
        }
    }
' <<< "$(seq 9 | sed 'H;g')"

# Specify output separator character.
printf '1 2 3' | awk 'BEGIN {OFS=":"}; {print $1,$2,$3}'

# Search each blank-line-separated paragraph for the given REGEX match.
# Matching paragraphs are printed run together, with no blank line between them.
awk -v RS='' '/42B/' file

# Search each paragraph for the given REGEX match.
# Matching paragraphs will be separated by a blank line in the output.
awk -v RS='' -v ORS='\n\n' '/42B/' file

# Display only the first field of text taken from STDIN.
echo 'Field_1  Field_2  Field_3' | awk '{print $1}'
# Note that in this case, you're far better off using cut(1).
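
# For instance, one way to do the same with cut(1):
echo 'Field_1  Field_2  Field_3' | cut -d ' ' -f 1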

# Use AWK on its own, without needing any input via STDIN.
awk 'BEGIN {print("Example text.")}'

# Access environment variables from within AWK.
awk 'BEGIN {print ENVIRON["LS_COLORS"]}'

# Count number of lines taken from STDIN.
free | awk '{L++} END {print(L)}'
# Cleaner, more efficient approach to the above.
free | awk 'END {print(NR)}'

# Output unique list of available sections under which to create a DEB package.
awk '!A[$1]++ {print($1)}' <<< "$(dpkg-query --show -f='${Section}\n')"

# Using process substitution (`<()`, which is NOT command substitution) and
# AWK's associative arrays, we can print just column 2 of the line whose
# first column is "Mem:".
awk 'NR != 1 {A[$1] = $2} END {print(A["Mem:"])}' <(free -h)
# While the line below is simpler, it is not equivalent: it prints as soon as
# a line matches, whereas the version above stores every row for later use in
# the END block, which is preferable when more than one value is needed.
awk '/^Mem:/ {print($2)}' <(free -h)
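
# Storing every row first, as above, also lets you combine values from
# different lines; for example, print both the "Mem:" and "Swap:" totals.
awk 'NR != 1 {A[$1] = $2} END {print(A["Mem:"], A["Swap:"])}' <(free -h)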

# Output a unique list of the uppercase, `$`-prefixed variable names used in
# FILE, with the leading `$` (the sigil) stripped.
awk '
    {
        for(F=1; F<=NF; F++){
            if($F ~ /^\$[A-Z_]+$/){
                A[$F]++
            }
        }
    }
    END {
        for(I in A){
            printf("%s\n", substr(I, 2))
        }
    }
' FILE

# Output only lines from FILE between PATTERN_1 and PATTERN_2. Good for logs.
awk '/PATTERN_1/,/PATTERN_2/ {print}' FILE
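
# For example, pull a single time window out of a log (the timestamps and the
# LOGFILE name are illustrative placeholders).
awk '/10:00:00/,/10:05:00/' LOGFILE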

# Pretty-print a table summarizing the non-system users on the system.
awk -F ':' '
    BEGIN {
        printf("%-17s %-4s %-4s %-s\n", "NAME", "UID", "GID", "SHELL")
    }
    $3 >= 1000 && $1 != "nobody" {
        printf("%-17s %-d %-d %-s\n", $1, $3, $4, $7)
    }
' /etc/passwd

# Display the machine's total amount of RAM in MiB, read from /proc/meminfo.
# The awkward shell quoting embeds a literal single-quote so that the %'d
# format groups the digits with a thousands separator (locale-dependent),
# much as Bash's own `printf` built-in can do.
awk '/^MemTotal:/ {printf("%'"'"'dMiB\n", $2 / 1024)}' /proc/meminfo

# It's possible to sort and uniq-ify strings in AWK, replacing separate
# sort(1) and uniq(1) calls with a single AWK invocation. Granted, `sort -u`
# does both, but AWK offers much more flexibility. Note that uniq(1) only
# removes adjacent duplicates, whereas the array approach below catches
# duplicate lines anywhere in the input. The asorti() function requires gawk.
awk '
    {
        Lines[$0]++
    }
    END {
        Count = asorti(Lines, Sorted)
        for (I = 1; I <= Count; I++) {
            print(Sorted[I])
        }
    }
' FILE

# Remove duplicate lines
awk '!seen[$0]++' file.txt

# Remove all empty lines
awk 'NF > 0' file.txt

Further Resources

For more in-depth information and advanced usage of Awk, refer to the official documentation and community resources.