awk

Learn to use the Awk command line tool for pattern scanning and text file processing. Extract columns, filter lines, and manipulate data efficiently.

Awk Command Line Tool

Awk is a text-processing utility that reads input one line at a time and runs an action on each line that matches a given pattern. It is particularly useful for extracting data, filtering records, and generating reports from structured text files.
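As a minimal sketch of the pattern-and-action model, the following sums the second field of every line whose first field is ok, then prints the total once all input has been read. The input data and the variable names (input, ok_total) are made up for illustration.

```shell
# Hypothetical input: a status word followed by a number
input=$(printf 'ok 10\nerror 3\nok 7')

# pattern { action }: the action runs only on lines where the pattern is true.
# END { ... } runs once after the last line has been read.
ok_total=$(printf '%s\n' "$input" | awk '$1 == "ok" { sum += $2 } END { print sum }')

echo "$ok_total"
```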

Awk Basics: Delimiters and Printing

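By default, Awk splits each line into fields on runs of spaces and tabs. To split on a different delimiter, such as a single tab so that fields may contain spaces, use the -F option. The backslash is doubled in the examples below because the argument is not quoted in the shell; -F'\t' is equivalent.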

# Set tab as delimiter and print the entire line
awk -F\\t '{ print $0 }' file.txt
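The difference between default splitting and tab-only splitting shows up as soon as a field contains a space. The sketch below uses a made-up two-field line; the variable names are illustrative only.

```shell
# Hypothetical sample: two tab-separated fields, the first containing a space
line=$(printf 'New York\t8500000')

# Default splitting: any run of spaces or tabs separates fields, so $1 is cut short
default_first=$(printf '%s\n' "$line" | awk '{ print $1 }')

# Tab-only splitting: $1 keeps its embedded space
tab_first=$(printf '%s\n' "$line" | awk -F'\t' '{ print $1 }')

echo "default: $default_first"
echo "tab:     $tab_first"
```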

Extracting Columns with Awk

You can select and print specific columns (fields) from your text files. Each field is available as $1, $2, and so on, numbered from 1; $0 holds the whole line.

# Print the first column
awk -F\\t '{ print $1 }' file.txt

# Print the first and second columns, separated by a tab
awk -F\\t '{ print $1"\t"$2 }' file.txt
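An alternative to concatenating "\t" by hand is to set the output field separator, OFS, once; a comma in print then inserts it between the values. This is a sketch on hypothetical three-column data, not the only way to do it.

```shell
# Hypothetical three-column, tab-separated sample
sample=$(printf 'a\tb\tc\nd\te\tf')

# FS controls how input is split, OFS what print puts between comma-separated values
two_cols=$(printf '%s\n' "$sample" | awk 'BEGIN { FS = OFS = "\t" } { print $1, $2 }')

printf '%s\n' "$two_cols"
```

This form scales better when many columns are printed, since the separator is stated in one place.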

Conditional Processing with Awk

A condition placed before the action block restricts the action to lines for which the condition is true, which is how Awk filters data.

# Write lines whose first column equals 1 to matches_one.txt
awk -F\\t '$1 == 1 { print $0 }' file.txt > matches_one.txt
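One detail worth knowing about this comparison: when a field looks like a number and is compared with a numeric constant, Awk compares numerically, so values such as 01 or 1.0 also equal 1. Comparing against the string "1" forces an exact text match. The sample values below are made up to show the difference.

```shell
# Hypothetical first-column values, one per line
values=$(printf '1\n01\n1.0\n2')

# Numeric comparison: anything whose numeric value is 1 matches
numeric_matches=$(printf '%s\n' "$values" | awk '$1 == 1')

# String comparison: only the exact text "1" matches
string_matches=$(printf '%s\n' "$values" | awk '$1 == "1"')

printf 'numeric:\n%s\n' "$numeric_matches"
printf 'string:\n%s\n' "$string_matches"
```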

# Write lines whose first column does not equal 1 to does_not_match_one.txt
awk -F\\t '$1 != 1 { print $0 }' file.txt > does_not_match_one.txt
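The two commands above read file.txt twice. As a sketch of a single-pass alternative, Awk can redirect print to a file from inside the program. The temporary directory, the sample rows, and the dir variable are assumptions made for the example.

```shell
# Work in a throwaway directory with hypothetical sample data
workdir=$(mktemp -d)
printf '1\tapple\n2\tbanana\n1\tcherry\n' > "$workdir/file.txt"

# One pass: matching lines go to one file, everything else to the other.
# "next" skips the second rule for lines already handled by the first.
awk -F'\t' -v dir="$workdir" '
    $1 == 1 { print > (dir "/matches_one.txt"); next }
            { print > (dir "/does_not_match_one.txt") }
' "$workdir/file.txt"

matched=$(cat "$workdir/matches_one.txt")
unmatched=$(cat "$workdir/does_not_match_one.txt")
rm -r "$workdir"

printf 'matched:\n%s\nunmatched:\n%s\n' "$matched" "$unmatched"
```

Inside Awk, > truncates the file the first time it is opened and then keeps appending for the rest of the run, so each output file holds all of its lines.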

Advanced Column Manipulation

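Fields can also be modified before printing. Assigning to a field makes Awk rebuild the line using the output field separator (OFS), which defaults to a single space, so set OFS to a tab to keep the output tab-separated.

# Remove the first column and print the rest of the line, dropping the leftover leading tab
awk 'BEGIN { FS = OFS = "\t" } { $1 = ""; print substr($0, 2) }' file.txt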
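To see why the output separator matters here, the sketch below blanks the first field of a made-up three-column line twice: once with the default OFS and once with OFS set to a tab. The variable names are illustrative.

```shell
# Hypothetical tab-separated line
line=$(printf 'id\tname\tcity')

# Default OFS: assigning to $1 rebuilds the line with single spaces between fields
rebuilt_default=$(printf '%s\n' "$line" | awk -F'\t' '{ $1 = ""; print }')

# OFS set to tab: the rebuilt line keeps tabs; substr($0, 2) drops the leading one
rebuilt_tabs=$(printf '%s\n' "$line" | awk 'BEGIN { FS = OFS = "\t" } { $1 = ""; print substr($0, 2) }')

printf '[%s]\n' "$rebuilt_default"
printf '[%s]\n' "$rebuilt_tabs"
```

When the goal is only to drop leading columns from tab-separated input, cut -f2- file.txt is a simpler tool for the same job.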

Further Learning Resources