Sort Command - Linux Text File Sorting Utility | Online Free DevTools by Hexmos - cmd Cheatsheets

Understanding the Linux Sort Command

The sort command in Linux is a powerful utility used to sort lines of text files. It can arrange data alphabetically, numerically, or based on specific fields within each line. This makes it indispensable for data processing and analysis directly from the command line. Understanding its various options allows for flexible manipulation of text-based data.

Sorting Lines Alphabetically and Reverse

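By default, sort compares lines using the collation order of the current locale; in the C (POSIX) locale this is plain byte order, which matches the ASCII table. The -r flag reverses the sort order, which is useful for displaying data from largest to smallest or Z to A.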

# sort
# Sort lines of text files

# Return the contents of the British English dictionary, in reverse order.
sort -r /usr/share/dict/british-english
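The effect of the locale on ordering can be seen with a small sketch. The word list here is made-up sample data, not taken from any dictionary file, and `LC_ALL=C` is set explicitly so the result does not depend on the machine's locale settings.

```shell
# Hypothetical sample words, mixed upper- and lowercase.
sample_words() { printf '%s\n' apple Banana cherry Apple; }

# In the C locale, sort compares bytes, so uppercase letters (A-Z)
# come before lowercase letters (a-z).
c_sorted=$(sample_words | LC_ALL=C sort)
printf '%s\n' "$c_sorted"

# The same comparison with -r simply flips the result.
c_reversed=$(sample_words | LC_ALL=C sort -r)
printf '%s\n' "$c_reversed"
```

Without `LC_ALL=C`, many locales interleave upper- and lowercase words instead, so the same input may come out in a different order.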

Filtering Adjacent Duplicate Lines with Sort

While uniq is primarily for removing duplicate lines, sort with the -u flag can filter out adjacent duplicate lines after sorting. For more complex uniqueness checks or when sort's capabilities are insufficient, consider using tools like AWK.

# The GNU sort(1) command can also filter out duplicate lines and can
# therefore overlap with the uniq(1) command. However, uniq(1) has some options
# that sort(1) lacks, so refer to the man page for your situation if you
# require something beyond a basic uniqueness check. In addition, there is the
# potential for parallelizing the processing by piping sort(1) into uniq(1) for
# non-trivial tasks.
#
# By default, sort(1) sorts lines or fields using the locale's collation order
# (byte order, i.e. the ASCII table, in the C locale), and upper- and lowercase
# letters are treated as distinct. Sorting places identical lines next to one
# another, so the duplicates can then be removed.
#
# If you need better uniq-ing, you could refer to AWK & its associative arrays.
printf '%s\n' this is a list of of random words with duplicate words | sort -u
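As a rough comparison of the three approaches mentioned above, the sketch below runs the same word list through `sort -u`, through `sort | uniq -d`, and through an AWK associative array. The helper name `sample` and the array name `seen` are illustrative choices, not part of any tool.

```shell
# The same word list as in the example above, one word per line.
sample() { printf '%s\n' this is a list of of random words with duplicate words; }

# sort -u: sorted output with one copy of each line.
unique_sorted=$(sample | LC_ALL=C sort -u)
printf '%s\n' "$unique_sorted"

# uniq -d: something sort cannot do on its own, listing only the
# lines that occur more than once (input must be sorted first).
repeated=$(sample | LC_ALL=C sort | uniq -d)
printf '%s\n' "$repeated"

# AWK associative array: drop duplicates while keeping the original
# order, printing a line only the first time it is seen.
first_seen=$(sample | awk '!seen[$0]++')
printf '%s\n' "$first_seen"
```

The AWK variant is handy when the input order matters, since neither `sort -u` nor `uniq` can deduplicate non-adjacent lines without reordering them.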

Numerical Sorting

When sorting numbers, it's crucial to use the -n flag. Without it, sort treats numbers as strings, leading to incorrect ordering (e.g., 1, 10, 11, 2). The -n flag ensures proper numerical comparison.

# Sort numerically. If you don't provide the `-n` flag, sort(1) will instead
# compare the lines character by character, as mentioned above, meaning it'll
# display as 1, 10, 11, 2, 3, 4, etc.
printf '%d\n' {1..9} 10 11 | sort -n
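To make the difference concrete, here is a small side-by-side sketch using a handful of made-up numbers. The helper name `nums` is hypothetical.

```shell
# Hypothetical unsorted numbers, one per line.
nums() { printf '%s\n' 10 2 1 11 3; }

# Without -n the lines are compared as strings, so "10" and "11"
# land before "2" because the character "1" sorts before "2".
as_text=$(nums | LC_ALL=C sort)
printf '%s\n' "$as_text"

# With -n the lines are compared by numeric value.
as_numbers=$(nums | sort -n)
printf '%s\n' "$as_numbers"
```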

Sorting Human-Readable Sizes

The sort command can also handle human-readable sizes with unit suffixes (such as K, M, and G) using the -h flag. Combined with the -k flag to specify the sort key (column) and -r for reverse order, it's excellent for analyzing disk usage from commands like df.

# You can even sort human-readable sizes. In this example, the 2nd column is
# being sorted, thanks to the use of the `-k` flag, and the sorting is
# reversed, so that the top-most storage space hungry filesystems are displayed
# from df(1).
df -ht ext4 /dev/sd[a-z][1-9]* | sed '1d' | sort -rhk 2
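Since the df output depends on the machine, the same idea can be tried on fixed input. The sketch below uses invented device names and sizes, and assumes a sort that supports `-h` (GNU coreutils does; it is not part of POSIX). The key is written as `2,2` so that only the second column is compared, rather than everything from the second column to the end of the line.

```shell
# Hypothetical "device size" pairs standing in for df output.
usage() {
    printf '%s\n' '/dev/sda1 200M' '/dev/sdb1 1.5G' '/dev/sdc1 12K' '/dev/sdd1 40G'
}

# -k 2,2 restricts the key to column 2; the h and r modifiers apply
# human-readable comparison and reverse order to that key only.
by_size=$(usage | sort -k 2,2hr)
printf '%s\n' "$by_size"
```

Attaching `h` and `r` to the key instead of passing them globally keeps the ordering rules next to the column they apply to, which helps once more than one `-k` is in play.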
