Split Command
File Splitting Utility
The split command is a command-line utility found in Unix-like
operating systems that breaks a large file into smaller, more
manageable pieces. This is particularly useful for transferring large
files over networks, managing disk space, or processing files in
chunks. The size of each output file can be set either as a number of
lines or as a number of bytes.
Splitting Text Files by Lines
To split a large text file into smaller files, each containing a
specific number of lines, you can use the -l option.
This is a common use case for processing log files or large
datasets.
# To split a large text file into smaller files of 1000 lines each:
split -l 1000 <file>
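This command will create files named xaa, xab, and so on in the
current directory, each containing 1000 lines from the original
<file>. The last file holds whatever lines remain, so it may be
shorter. To change the prefix of the output files from the default
x, pass the new prefix as a second argument after <file>.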
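As a minimal sketch of the custom prefix, the example below builds a throwaway 2500-line file and splits it with the prefix part_ instead of x. The names demo.txt and part_ are hypothetical and chosen only for illustration; the work happens in a temporary directory so no existing files are touched.

```shell
# Work in a scratch directory so the demo does not touch existing files.
cd "$(mktemp -d)"

# Hypothetical demo input: 2500 numbered lines.
seq 1 2500 > demo.txt

# Split into pieces of at most 1000 lines, named part_aa, part_ab, ...
split -l 1000 demo.txt part_

# Show how many lines landed in each piece.
wc -l part_*
```

With 2500 input lines, this should produce part_aa and part_ab with 1000 lines each and part_ac with the remaining 500.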
Splitting Binary Files by Size
For binary files, or when you need precise control over file size,
the -b option is used. This allows you to split files
into chunks of a specified size, such as megabytes (M) or kilobytes
(K).
# To split a large binary file into smaller files of 10M each:
split -b 10M <file>
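This command will create smaller files of exactly 10 megabytes each
from the original <file>, except for the last one, which holds the
remaining bytes and may be smaller. In GNU split, 10M means
10 x 1024 x 1024 bytes.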
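To see how the size suffix is interpreted, here is a small sketch that uses a 2500-byte test file and the k suffix, which stands for 1024 bytes. The names demo.bin and chunk_ are hypothetical, and head -c is assumed to be available (it is in GNU and BSD userlands).

```shell
# Work in a scratch directory so the demo does not touch existing files.
cd "$(mktemp -d)"

# Hypothetical demo input: 2500 zero bytes.
head -c 2500 /dev/zero > demo.bin

# Split into pieces of at most 1k (1024 bytes), named chunk_aa, chunk_ab, ...
split -b 1k demo.bin chunk_

# Show the size in bytes of each piece.
wc -c chunk_*
```

The first two pieces should be 1024 bytes each, and the last one should hold the remaining 452 bytes.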
Consolidating Split Files
After splitting files, you often need to reassemble them back into a
single file. The cat command is the standard tool for
this purpose. By concatenating all the split files in the correct
order, you can restore the original file.
# To consolidate split files into a single file:
cat x* > <file>
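This command assumes your split files are named with the default
prefix 'x' (e.g., xaa, xab, ...). The shell expands x* in sorted
order, which matches the order in which split wrote the pieces.
Because x* matches every file whose name starts with x, make sure no
unrelated files with that prefix are in the directory. The output is
redirected to a new file named <file>.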
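One way to confirm that the reassembled file matches the original is to compare the two byte for byte with cmp. The sketch below runs a full round trip on a throwaway file; the names original.txt, piece_, and restored.txt are hypothetical.

```shell
# Work in a scratch directory so the demo does not touch existing files.
cd "$(mktemp -d)"

# Hypothetical demo input: 2500 numbered lines.
seq 1 2500 > original.txt

# Split into 1000-line pieces, then join them back in glob (sorted) order.
split -l 1000 original.txt piece_
cat piece_* > restored.txt

# cmp exits with status 0 only when both files are byte-for-byte identical.
cmp original.txt restored.txt && echo "files are identical"
```

Using a distinct prefix such as piece_ keeps the glob from picking up the input or output file by accident.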