Uniq Command - Report or Omit Repeated Lines | Online Free DevTools by Hexmos

Utilize the Uniq command to report or omit repeated lines from sorted input. Learn options like -c, -d, -u, and -i for precise line filtering.

Uniq: Report or Omit Repeated Lines

The uniq command is a command-line utility that filters text by reporting or omitting repeated lines. It is useful for data cleaning and analysis. Because uniq compares only adjacent lines, duplicates that are not next to each other go undetected, so the input is usually sorted first (for example, with sort).

Key Features and Options

The uniq command offers several options to customize its behavior:

  • -c: Show repetition counts
    This option prefixes each output line with the number of times it occurred in the input.
  • -d: Print only repeated lines
    With this flag, uniq outputs only lines that are repeated on consecutive lines (in sorted input, lines that appear more than once), printing each such line a single time.
  • -u: Print only unique lines
    With this flag, uniq outputs only lines that are not repeated on consecutive lines (in sorted input, lines that appear exactly once).
  • -i: Case-insensitive comparison
    This flag makes the comparison of lines case-insensitive, treating 'Apple' and 'apple' as identical.
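
The options above can be tried on a small sample. The following is a minimal sketch: the fruit names and the variable names (input, counts, repeated, singles, folded) are made up for illustration, and the input is already sorted so that duplicates are adjacent.

```shell
# Sample input, already sorted so that duplicates are adjacent.
# The fruit names are hypothetical illustration data.
input=$(printf '%s\n' apple apple banana cherry cherry cherry)

# -c: prefix each distinct line with its count
# (the padding before the count varies between implementations)
counts=$(printf '%s\n' "$input" | uniq -c)

# -d: one copy of each line that is repeated
repeated=$(printf '%s\n' "$input" | uniq -d)

# -u: only lines that are not repeated
singles=$(printf '%s\n' "$input" | uniq -u)

# -i: ignore case when comparing adjacent lines
folded=$(printf '%s\n' Apple apple APPLE banana | uniq -i)

printf '%s\n' "$counts" "$repeated" "$singles" "$folded"
```

Here -d should report apple and cherry, -u should report only banana, and -i should collapse the three spellings of apple into a single line.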

Usage Example

To see how many times each line appears in a file, sort it and pipe the result to uniq -c. If the file (here sorted_data.txt) is already sorted, the sort step is redundant but harmless:

sort sorted_data.txt | uniq -c
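
A common follow-up is to rank the counted lines by frequency. The sketch below assumes a temporary file with made-up contents standing in for sorted_data.txt; the variable names tmpfile and ranked are likewise illustrative.

```shell
# Hypothetical sample data standing in for sorted_data.txt.
tmpfile=$(mktemp)
printf '%s\n' cherry apple cherry banana cherry apple > "$tmpfile"

# Group identical lines, count each group, then sort the counts
# numerically in descending order so the most frequent line comes first.
ranked=$(sort "$tmpfile" | uniq -c | sort -rn)
rm -f "$tmpfile"

printf '%s\n' "$ranked"
```

The second sort works on the count column that uniq -c adds, so cherry (three occurrences) is listed before apple (two) and banana (one).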

Understanding Sorted Input

It is crucial to remember that uniq only compares adjacent lines. If your data is not sorted, you might get unexpected results. For instance, if a line appears twice but with other lines in between, uniq will not consider them as duplicates unless the file is sorted first.
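
The effect of adjacency can be seen by running uniq with and without a preceding sort. This is a small sketch with made-up input; the variable names unsorted, raw, and deduped are illustrative.

```shell
# Unsorted input: the two "apple" lines are separated by "banana".
unsorted=$(printf '%s\n' apple banana apple)

# Without sorting, no two adjacent lines match, so nothing is removed.
raw=$(printf '%s\n' "$unsorted" | uniq)

# Sorting first makes the duplicates adjacent, so uniq collapses them.
deduped=$(printf '%s\n' "$unsorted" | sort | uniq)

printf 'without sort:\n%s\n\nwith sort:\n%s\n' "$raw" "$deduped"
```

Without the sort step all three lines pass through unchanged; with it, only apple and banana remain.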

Option Summary

Option  Description
-c      Prefixes each output line with the number of times it occurred.
-d      Prints one copy of each repeated line.
-u      Prints only lines that are not repeated.
-i      Compares lines case-insensitively.