awk
Learn to use the Awk command line tool for pattern scanning and text file processing. Extract columns, filter lines, and manipulate data efficiently.
Awk Command Line Tool
Awk is a powerful text-processing utility that scans files line by line and performs actions based on specified patterns. It's particularly useful for extracting data, filtering records, and generating reports from structured text files.
Awk Basics: Delimiters and Printing
By default, Awk splits each line into fields on runs of whitespace (spaces and tabs). To use a different delimiter, such as a tab, pass the -F option.
# Set tab as delimiter and print the entire line
awk -F\\t '{ print $0 }' file.txt
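To see how the delimiter changes what counts as a field, here is a minimal sketch. The file name sample.tsv and its contents are made up for illustration; the only Awk feature used beyond the text above is quoting the delimiter as -F'\t', which is equivalent to -F\\t.

```shell
# Hypothetical sample data: two tab-separated columns; the second contains a space.
tmpdir=$(mktemp -d)
printf 'alice\tNew York\nbob\tLos Angeles\n' > "$tmpdir/sample.tsv"

# Default splitting: any run of spaces or tabs separates fields,
# so "New York" is split in two and $2 is only "New".
default_split=$(awk '{ print $2 }' "$tmpdir/sample.tsv")

# Tab as the delimiter: "New York" stays together as field 2.
tab_split=$(awk -F'\t' '{ print $2 }' "$tmpdir/sample.tsv")

echo "$default_split"
echo "$tab_split"
rm -r "$tmpdir"
```

If your data can contain spaces inside a column, setting the delimiter explicitly is usually the safer choice.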
Extracting Columns with Awk
You can easily select and print specific columns (fields) from your text files. Fields are numbered starting from 1.
# Print the first column
awk -F\\t '{ print $1 }' file.txt
# Print the first and second columns, separated by a tab
awk -F\\t '{ print $1"\t"$2 }' file.txt
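Two built-in variables are often handy alongside the column selections above: OFS (the output field separator used when print arguments are separated by commas) and NF (the number of fields on the current line). The following is a small sketch using a made-up file, people.tsv, purely for illustration.

```shell
# Hypothetical sample data: three tab-separated columns.
tmpdir=$(mktemp -d)
printf 'id1\talice\t30\nid2\tbob\t25\n' > "$tmpdir/people.tsv"

# With OFS set to tab, the comma in print joins the fields with a tab,
# which avoids writing "\t" by hand between every pair of columns.
first_two=$(awk -F'\t' -v OFS='\t' '{ print $1, $2 }' "$tmpdir/people.tsv")

# NF is the field count for the current line, so $NF is the last column.
last_col=$(awk -F'\t' '{ print $NF }' "$tmpdir/people.tsv")

echo "$first_two"
echo "$last_col"
rm -r "$tmpdir"
```

Using $NF means the command keeps working if the number of columns differs from file to file.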
Conditional Processing with Awk
Awk allows you to apply actions only to lines that match certain conditions. This is crucial for filtering data.
# Print lines where the first column equals 1 (compared as a number, so 1.0 also matches)
awk -F\\t '$1 == 1 { print $0 }' file.txt > matches_one.txt
# Print lines where the first column does not equal '1'
awk -F\\t '$1 != 1 { print $0 }' file.txt > does_not_match_one.txt
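Conditions are not limited to numeric equality. As a rough sketch, the example below contrasts a numeric comparison, a string comparison, and a regular-expression match. The file items.tsv and its contents are invented for illustration.

```shell
# Hypothetical sample data: a numeric-looking first column and a name.
tmpdir=$(mktemp -d)
printf '1\tapple\n1.0\tavocado\n01\tbanana\n10\tcherry\n' > "$tmpdir/items.tsv"

# Numeric comparison: "1", "1.0" and "01" all equal the number 1.
numeric_matches=$(awk -F'\t' '$1 == 1 { print $2 }' "$tmpdir/items.tsv")

# String comparison: only the exact text "1" matches.
string_matches=$(awk -F'\t' '$1 == "1" { print $2 }' "$tmpdir/items.tsv")

# Regular-expression match: rows whose second column starts with "a".
regex_matches=$(awk -F'\t' '$2 ~ /^a/ { print $1 }' "$tmpdir/items.tsv")

echo "$numeric_matches"
echo "$string_matches"
echo "$regex_matches"
rm -r "$tmpdir"
```

Quoting the right-hand side ("1" instead of 1) is the simplest way to force an exact text match when values such as 01 or 1.0 should not count.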
Advanced Column Manipulation
You can also manipulate columns, for example, by removing the first column and printing the rest of the line.
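# Remove the first column and print the rest of the line.
# Assigning to a field rebuilds the line using OFS, so set OFS to tab to keep tabs;
# substr($0, 2) drops the leftover leading tab.
awk 'BEGIN { FS = OFS = "\t" } { $1 = ""; print substr($0, 2) }' file.txt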
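One detail worth knowing when modifying columns: assigning to any field makes Awk rebuild the whole line using the output field separator (OFS), which defaults to a single space. The sketch below, using a made-up one-line file row.tsv, shows that effect and then swaps two columns while keeping tabs.

```shell
# Hypothetical sample data: three tab-separated columns; the second contains a space.
tmpdir=$(mktemp -d)
printf 'a\tb c\td\n' > "$tmpdir/row.tsv"

# OFS left at its default: after the assignment the line is rejoined with spaces,
# so the original tab boundaries are lost.
space_joined=$(awk -F'\t' '{ $1 = $1; print }' "$tmpdir/row.tsv")

# FS and OFS both set to tab: swap columns 1 and 2 and keep tab separators.
swapped=$(awk 'BEGIN { FS = OFS = "\t" } { tmp = $1; $1 = $2; $2 = tmp; print }' "$tmpdir/row.tsv")

echo "$space_joined"
echo "$swapped"
rm -r "$tmpdir"
```

Setting FS and OFS together in a BEGIN block is a common way to make sure the output uses the same delimiter as the input.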
Further Learning Resources
- GNU Awk Manual: The official documentation for gawk, the GNU implementation of Awk.
- MDN Web Docs - Date: While not directly Awk, understanding date formats is often relevant when processing text files.
- Stack Overflow - Awk Tag: Find answers to common Awk questions and solutions.