How To Merge Two Files In Linux – Cat Command File Merging

Combining text files in the Linux terminal usually takes a single command. This guide explains how to merge two files in Linux, from simple concatenation to merging with sorting and deduplication.

Merging files is a common task for developers, system administrators, and data analysts. Linux offers several built-in tools that make this process fast and efficient. You do not need any special software or programming skills.

Let us start with the simplest approach and then explore more powerful options. Each method has its own use case, and we will cover them all.

How To Merge Two Files In Linux

The most straightforward way to merge two files is the cat command. Its name is short for “concatenate”, and it does exactly that: it writes its input files to standard output one after another.

Open your terminal and navigate to the directory containing your files. Suppose you have two files named file1.txt and file2.txt. To merge them into a new file called merged.txt, run:

cat file1.txt file2.txt > merged.txt

This command reads both files sequentially and writes their contents to the output file. The order matters: file1.txt content appears first, followed by file2.txt.

If you want to append one file to the end of another without creating a new file, use the append operator:

cat file2.txt >> file1.txt

This adds the contents of file2.txt to the end of file1.txt. Be careful with this method because it modifies the original file.
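The difference between the two operators is easiest to see on throwaway data. The sketch below is only an illustration: the scratch directory, the sample contents, and the file1.txt.bak backup name are assumptions, not part of any required workflow.

```shell
# Scratch directory so the demo never touches real files.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical sample data.
printf 'alpha\nbeta\n' > file1.txt
printf 'gamma\ndelta\n' > file2.txt

# Concatenate into a new file; both inputs stay unchanged.
cat file1.txt file2.txt > merged.txt

# Keep a backup, then append file2.txt to file1.txt in place.
cp file1.txt file1.txt.bak
cat file2.txt >> file1.txt

cat merged.txt
```

After both commands, merged.txt and the modified file1.txt hold the same four lines, while file1.txt.bak still holds the original two. If the first file does not end with a newline, its last line and the first line of the second file end up on one line.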

Merging Multiple Files With Cat

You are not limited to just two files. The cat command can merge any number of files at once:

cat file1.txt file2.txt file3.txt file4.txt > combined.txt

You can also use wildcards to merge all files of a certain type:

cat *.txt > all_text_files.txt

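This merges every .txt file in the current directory into one file. The shell expands the pattern before cat runs and sorts the matches according to your locale's collation order, so file10.txt comes before file2.txt. If all_text_files.txt already exists when you run the command, it also matches *.txt and becomes one of its own inputs, so write the result to a name or directory that the pattern does not match.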
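To check the order before merging, you can print the expanded pattern first. A small sketch, assuming three hypothetical files and an output name (all_text.merged) chosen so that the *.txt pattern cannot match it:

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical inputs, created out of order on purpose.
printf 'from b\n' > b.txt
printf 'from a\n' > a.txt
printf 'from c\n' > c.txt

# Preview the order in which the shell will hand the files to cat.
printf '%s\n' *.txt

# Write to a name that *.txt does not match, so reruns stay safe.
cat *.txt > all_text.merged
cat all_text.merged
```

The preview lists a.txt, b.txt, c.txt, and the merged file follows that order regardless of when each file was created.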

Adding Line Numbers During Merge

Sometimes you need to refer to specific lines in the merged output. The cat -n option adds line numbers:

cat -n file1.txt file2.txt > numbered_merge.txt

This numbers all lines sequentially from start to finish. The count does not restart for each file, so the numbers alone do not show which file a line came from. It is still helpful for debugging or reviewing merged logs.
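If you also need to know which file each line came from, awk can prefix every line with its source. This is a sketch of one possible approach rather than a cat feature; the sample data and the tagged_merge.txt name are made up for illustration.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf 'start\nstop\n' > file1.txt
printf 'error\n' > file2.txt

# FILENAME is the current input file, FNR the line number within that file.
awk '{ print FILENAME ":" FNR ":" $0 }' file1.txt file2.txt > tagged_merge.txt
cat tagged_merge.txt
```

Each output line has the form name:line:text, for example file2.txt:1:error.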

Using The Paste Command For Side-By-Side Merging

The paste command merges files horizontally instead of vertically. It places lines from each file next to each other, separated by a tab character.

To merge two files side by side:

paste file1.txt file2.txt > side_by_side.txt

This is useful when you have related data in separate files, like names in one file and addresses in another. Each line from file1.txt appears on the same line as the corresponding line from file2.txt.

Changing The Delimiter In Paste

By default, paste uses a tab as the separator. You can change this with the -d option:

paste -d ',' file1.txt file2.txt > csv_merge.csv

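This creates comma-separated output, which works for simple CSV data. paste does not quote fields, so values that themselves contain commas will break the columns. You can use any single character as the delimiter, including spaces, colons, or pipes.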
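Here is a short demonstration of both forms. The names.txt and cities.txt files and their contents are hypothetical stand-ins for file1.txt and file2.txt.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf 'Alice\nBob\n' > names.txt
printf 'Berlin\nLima\n' > cities.txt

# Default: columns separated by a tab.
paste names.txt cities.txt > side_by_side.txt

# Same merge with a comma between the columns.
paste -d ',' names.txt cities.txt > csv_merge.csv
cat csv_merge.csv
```

The CSV version contains Alice,Berlin on the first line and Bob,Lima on the second.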

Handling Uneven Line Counts

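If your files have different numbers of lines, paste still works. Once a shorter file runs out, paste treats its remaining lines as empty, so those fields are left blank. paste has no option for a fill value such as “NA”; to get one, post-process the output with a tool like awk.

The -s option does something different: it merges serially, joining all lines of each file into a single output line, one line per file. For example, to turn three files into three colon-separated lines:

paste -d ':' -s file1.txt file2.txt file3.txt

This is less common but good to know for edge cases.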
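The following sketch contrasts the default parallel mode with -s on files of unequal length, and shows one way to put “NA” into blank fields by piping through awk. The awk step is an assumption about what you might want, not a paste feature, and it also replaces fields that were empty in the original data.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf 'a\nb\nc\n' > file1.txt
printf '1\n2\n' > file2.txt

# Parallel merge: the third row has an empty second field.
paste -d ':' file1.txt file2.txt > parallel.txt

# Serial merge: each input file becomes one output line.
paste -d ':' -s file1.txt file2.txt > serial.txt

# Replace every empty field with NA after the parallel merge.
paste -d ':' file1.txt file2.txt |
    awk 'BEGIN { FS = OFS = ":" }
         { for (i = 1; i <= NF; i++) if ($i == "") $i = "NA"; print }' > filled.txt

cat parallel.txt serial.txt filled.txt
```

parallel.txt ends with the line c: while filled.txt ends with c:NA, and serial.txt holds the two lines a:b:c and 1:2.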

Merging Files With The Join Command

The join command merges files based on a common field. It works like a database join operation. This is more advanced than simple concatenation.

Both files must be sorted on the join field for this to work correctly. Use a plain text sort rather than sort -n, because join compares keys as text. Suppose you have two files with user IDs:

users.txt contains:

1 Alice
2 Bob
3 Charlie

emails.txt contains:

1 alice@example.com
2 bob@example.com
3 charlie@example.com

To merge them on the first field (the ID):

join users.txt emails.txt > merged_users.txt

The output combines matching lines based on the common ID field. Lines without a match are omitted by default.
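When the inputs are not already sorted, sort them on the key first. The sketch below assumes unsorted variants of the two files plus an extra ID 10 (added here only to show text ordering), and pins LC_ALL=C so that sort and join agree on the order; the .sorted file names are hypothetical.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf '3 Charlie\n10 Dana\n2 Bob\n' > users.txt
printf '2 bob@example.com\n3 charlie@example.com\n10 dana@example.com\n' > emails.txt

# Sort both files on field 1 as text, which is the order join expects.
LC_ALL=C sort -k 1,1 users.txt > users.sorted
LC_ALL=C sort -k 1,1 emails.txt > emails.sorted

LC_ALL=C join users.sorted emails.sorted > merged_users.txt
cat merged_users.txt
```

In text order, 10 sorts before 2, so the row for Dana comes first. Each output line contains the ID, the name, and the email address separated by spaces.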

Specifying Join Fields

You can specify which field to use for joining with the -1 and -2 options. -1 refers to the field in the first file, -2 to the second file:

join -1 2 -2 1 file1.txt file2.txt

This joins on the second field of the first file and the first field of the second file. Field numbers start at 1.

Handling Unmatched Lines

Use the -a option to include unmatched lines from one or both files:

join -a 1 file1.txt file2.txt

This includes all lines from the first file, even if there is no match in the second file. An unmatched line is printed as it is, without placeholder fields; to fill the gaps, combine -e with an explicit -o field list.
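A short sketch of both behaviours, using hypothetical data in which user 2 has no email entry. The -o list names the output fields explicitly (0 is the join field, 1.2 the second field of the first file, 2.2 the second field of the second file), and -e supplies the text for any that are missing.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf '1 Alice\n2 Bob\n3 Charlie\n' > users.txt
printf '1 alice@example.com\n3 charlie@example.com\n' > emails.txt

# Keep unmatched lines from the first file as they are.
join -a 1 users.txt emails.txt > left_join.txt

# Same, but print NA where the email field is missing.
join -a 1 -e 'NA' -o 0,1.2,2.2 users.txt emails.txt > left_join_filled.txt

cat left_join.txt left_join_filled.txt
```

The line for Bob reads 2 Bob in the first file and 2 Bob NA in the second.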

Merging Files With Sorting And Deduplication

Sometimes you need to merge files and then sort the combined content. The sort command can handle this in one step:

sort file1.txt file2.txt > sorted_merged.txt

This merges both files and sorts the output in text (collation) order. Add -n for numeric sorting. You can also remove duplicate lines with the -u option:

sort -u file1.txt file2.txt > unique_sorted_merge.txt

This is perfect for merging lists where duplicates are not wanted.
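The effect of -u is easy to verify on two small lists that share one entry. The sample words are made up, and the last step shows sort -m, which merges inputs that are already sorted without re-sorting them; that step is an optional extra, not something the commands above require.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf 'pear\napple\n' > file1.txt
printf 'banana\napple\n' > file2.txt

# Sorted merge, duplicates kept.
sort file1.txt file2.txt > sorted_merged.txt

# Sorted merge, duplicates removed.
sort -u file1.txt file2.txt > unique_sorted_merge.txt

# If each input is already sorted, -m only interleaves them.
sort file1.txt > file1.sorted
sort file2.txt > file2.sorted
sort -m file1.sorted file2.sorted > merged_presorted.txt

cat unique_sorted_merge.txt
```

sorted_merged.txt lists apple twice, while unique_sorted_merge.txt lists it once.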

Merging And Sorting By Specific Columns

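Use the -k option to sort by a specific column after merging:

sort -k 2,2 file1.txt file2.txt > sorted_by_column2.txt

This sorts the merged output by the second whitespace-separated column. The 2,2 form limits the key to that column; a bare -k 2 uses everything from the second column to the end of the line. Combine with -t to specify a custom delimiter for columns.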
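As an illustration of -k together with -t, the sketch below sorts comma-separated rows by their second field. The data is hypothetical, and the n suffix on the key requests a numeric comparison for that field only.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf 'bob,30\nalice,25\n' > file1.txt
printf 'carol,9\ndave,100\n' > file2.txt

# Text order on field 2: "100" sorts before "25", "30", and "9".
sort -t ',' -k 2,2 file1.txt file2.txt > sorted_text.txt

# Numeric order on field 2.
sort -t ',' -k 2,2n file1.txt file2.txt > sorted_by_column2.txt

cat sorted_by_column2.txt
```

The numeric version puts carol,9 first and dave,100 last; the text version does the opposite.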

Using Awk For Advanced Merging

The awk command is a powerful text processing tool that can merge files with custom logic. It is more flexible than cat or paste.

To merge two files line by line:

awk 'NR==FNR {a[NR]=$0; next} {print a[FNR], $0}' file1.txt file2.txt

This stores each line of the first file in an array, then prints it alongside the corresponding line from the second file. You can modify the output format easily.

Merging With Conditional Logic

Awk allows you to add conditions during merging. For example, to merge only lines that contain a specific word:

awk '/keyword/ {print}' file1.txt file2.txt > filtered_merge.txt

This merges only lines containing “keyword” from both files. You can combine multiple conditions with logical operators.
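Both awk patterns can be tried on small inputs. In the sketch below the file names, the log lines, and the choice of “error” and “disk” as search words are assumptions made for the example. Note that the line-by-line form holds the whole first file in memory and behaves unexpectedly if the first file is empty.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf 'a\nb\n' > file1.txt
printf '1\n2\n' > file2.txt

# Pair line N of file1.txt with line N of file2.txt.
awk 'NR==FNR {a[NR]=$0; next} {print a[FNR], $0}' file1.txt file2.txt > line_by_line.txt
cat line_by_line.txt

printf 'error: disk full\ninfo: started\n' > app1.log
printf 'info: ready\nerror: timeout\n' > app2.log

# Keep lines that mention "error" but not "disk".
awk '/error/ && !/disk/' app1.log app2.log > filtered_merge.txt
cat filtered_merge.txt
```

The first command produces the lines a 1 and b 2; the second keeps only error: timeout.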

Using Sed For Simple Merging

The sed command is another option for merging files, though it is less intuitive. You can read the contents of one file and insert it into another:

sed -e 'r file2.txt' file1.txt > merged_sed.txt

This reads file2.txt and appends its contents after every line of file1.txt. To append only at the end, use:

sed -e '$r file2.txt' file1.txt > merged_sed_end.txt

The $ address specifies the last line, so the content is added only once at the end.
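A quick comparison of the two sed forms on hypothetical two-line and one-line inputs:

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf 'one\ntwo\n' > file1.txt
printf 'X\n' > file2.txt

# No address: file2.txt is inserted after every line of file1.txt.
sed -e 'r file2.txt' file1.txt > merged_sed.txt

# $ address: file2.txt is inserted only after the last line.
sed -e '$r file2.txt' file1.txt > merged_sed_end.txt

cat merged_sed.txt
cat merged_sed_end.txt
```

The first result is one, X, two, X on separate lines. The second is one, two, X, which is the same as the output of cat file1.txt file2.txt.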

Merging Binary Files

Merging binary files requires caution. The cat command works for binary files too, but the result may not be usable depending on the file type.

For example, to merge two image files:

cat image1.jpg image2.jpg > combined.jpg

This creates a file that contains both images, but most image viewers will only show the first one. For proper binary merging, you need file-specific tools.

For video files, use ffmpeg or mkvmerge. For PDFs, use pdftk or qpdf. Always test the output to ensure it works correctly.
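One case where plain concatenation of binary data does work is reassembling pieces that were cut from a single file, for example with split. A minimal sketch, assuming random sample data and hypothetical file names:

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical binary input: 1000 random bytes.
head -c 1000 /dev/urandom > data.bin

# Cut it into 400-byte pieces named part_aa, part_ab, part_ac.
split -b 400 data.bin part_

# The pattern expands in suffix order, so cat restores the original byte stream.
cat part_* > rebuilt.bin

# cmp exits 0 when the two files are byte-for-byte identical.
cmp data.bin rebuilt.bin && echo "identical"
```

This works because the pieces carry no headers of their own, unlike two independent images or videos.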

Merging Files With Headers

When merging CSV or TSV files with headers, you usually want to keep the header only once. The tail command helps here:

head -n 1 file1.csv > merged.csv && tail -n +2 file1.csv >> merged.csv && tail -n +2 file2.csv >> merged.csv

This extracts the header from the first file, then appends all data rows from both files. It avoids duplicating the header line.

For a simpler approach, use awk:

awk 'NR==1 || FNR>1' file1.csv file2.csv > merged.csv

This prints the first line of the first file and skips the first line of subsequent files.
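The awk version can be checked on two tiny CSV files. The header comparison in this sketch is an added safety check and an assumption about what you want; the sample rows are hypothetical.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf 'id,name\n1,Alice\n' > file1.csv
printf 'id,name\n2,Bob\n' > file2.csv

# Only merge when both files start with the same header line.
if [ "$(head -n 1 file1.csv)" = "$(head -n 1 file2.csv)" ]; then
    awk 'NR==1 || FNR>1' file1.csv file2.csv > merged.csv
else
    echo "headers differ, not merging" >&2
fi

cat merged.csv
```

The result has the header once, followed by the data rows 1,Alice and 2,Bob.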

Merging Compressed Files

You can merge compressed files without first unpacking them to disk, using zcat for gzip files or bzcat for bzip2 files:

zcat file1.gz file2.gz > merged.txt

This decompresses both files on the fly and merges them into one plain text file. To keep the output compressed:

zcat file1.gz file2.gz | gzip > merged.gz

This pipes the merged content directly into a new gzip file.
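The sketch below builds two small gzip files and merges them both ways. The sample contents and the concatenated.gz name are assumptions. The last step relies on the fact that a gzip file may consist of several compressed members placed back to back, so the compressed files can also be joined directly with cat.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf 'first\n' > file1.txt
printf 'second\n' > file2.txt
gzip -c file1.txt > file1.gz
gzip -c file2.txt > file2.gz

# Decompress on the fly, merge, and recompress as a single stream.
zcat file1.gz file2.gz | gzip > merged.gz

# Alternative: join the compressed files themselves.
cat file1.gz file2.gz > concatenated.gz

zcat merged.gz
zcat concatenated.gz
```

Both zcat commands print the same two lines. Recompressing as one stream usually gives a slightly smaller file than joining the compressed members.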

Automating File Merging With Scripts

For repetitive merging tasks, create a simple shell script. Save the following as merge_files.sh:

#!/bin/bash
cat "$1" "$2" > "$3"
echo "Merged $1 and $2 into $3"

Make it executable with chmod +x merge_files.sh. Then run:

./merge_files.sh file1.txt file2.txt output.txt

You can expand the script to handle multiple files, add error checking, or include sorting options.
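One possible expansion is sketched below as a shell function. It is not the only reasonable design: the merge_files name, the argument order (output first, then any number of inputs), and the refusal to overwrite an existing output file are all choices made for this example.

```shell
# merge_files OUTPUT INPUT... : concatenate the inputs into OUTPUT.
merge_files() {
    if [ "$#" -lt 2 ]; then
        echo "usage: merge_files OUTPUT INPUT..." >&2
        return 2
    fi
    local output=$1
    shift

    # Check every input before writing anything.
    local input
    for input in "$@"; do
        if [ ! -r "$input" ]; then
            echo "merge_files: cannot read '$input'" >&2
            return 1
        fi
    done

    # Refuse to clobber an existing file (this also covers OUTPUT being an input).
    if [ -e "$output" ]; then
        echo "merge_files: '$output' already exists" >&2
        return 1
    fi

    cat -- "$@" > "$output" && echo "Merged $# file(s) into $output"
}

# Usage on throwaway data.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'a\n' > file1.txt
printf 'b\n' > file2.txt
printf 'c\n' > file3.txt
merge_files output.txt file1.txt file2.txt file3.txt
```

Running the same call a second time fails with a message instead of overwriting output.txt.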

Common Mistakes When Merging Files

One frequent error is using the wrong redirection operator. Using > overwrites the output file, while >> appends. Accidentally using > when you meant >> can destroy existing data.

Another mistake is forgetting that cat without redirection prints to the terminal. If you run cat file1.txt file2.txt without > or >>, the output scrolls on your screen.

Also, ensure your files exist and are readable. Use ls -l to check file permissions before merging.
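If accidental overwrites are a concern, the shell's noclobber option makes > fail when the target file already exists, while >> keeps working. A minimal sketch, assuming a hypothetical important.txt:

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf 'keep me\n' > important.txt

# With noclobber set, > refuses to overwrite an existing file.
set -o noclobber
if ! (printf 'oops\n' > important.txt) 2>/dev/null; then
    echo "overwrite blocked"
fi

# Appending is still allowed.
printf 'more\n' >> important.txt

# Restore the default behaviour.
set +o noclobber
cat important.txt
```

The file ends up with the lines keep me and more; the attempted overwrite never happens. A related trap is cat file1.txt file2.txt > file1.txt, where the shell empties file1.txt before cat gets to read it.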

Performance Considerations

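For small files, any method works fine. For very large files (gigabytes), cat is the fastest because it only copies bytes. The sort command is slower on large datasets because it must compare every line and spills to temporary files when the data does not fit in its memory buffer.

If you need to merge large files with sorting, note that GNU sort already runs several threads by default. The --parallel=N option sets the thread count explicitly and -S sets the buffer size, for example sort --parallel=4 -S 2G. Measure on your own data before relying on a particular setting.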

Avoid using awk or sed for large binary files, as they treat everything as text and may corrupt the data.

Frequently Asked Questions

What Is The Easiest Way To Merge Two Text Files In Linux?

The easiest way is using the cat command: cat file1.txt file2.txt > merged.txt. It requires no special options and works instantly.

Can I Merge Files Without Creating A New File?

Yes, use the append operator: cat file2.txt >> file1.txt. This adds the content of the second file to the end of the first file.

How Do I Merge Files Side By Side In Linux?

Use the paste command: paste file1.txt file2.txt > output.txt. This places lines from each file next to each other with a tab separator.

What Command Merges Files Based On A Common Field?

The join command merges files based on a common field. Both files must be sorted on that field for correct results.

How Do I Merge Multiple CSV Files With Headers?

Use awk 'NR==1 || FNR>1' file1.csv file2.csv file3.csv > merged.csv. This keeps the header only from the first file.

You now know several ways to merge two files in Linux. From simple concatenation with cat to field-based merging with join and custom logic with awk, you have multiple tools at your disposal. Choose the method that best fits your data and workflow.

Practice these commands with sample files to build confidence. The terminal is a powerful environment, and mastering file merging is a fundamental skill that will save you time and effort.