How To Find Human Readable File In Linux – Using Cat Command With Less Pager

Human-readable files contain text you can open and understand without special decoding tools. If you want to know how to find human-readable files in Linux, you have come to the right place. This guide shows simple commands and tricks to spot text files quickly.

Linux is packed with files. Some are binary, some are scripts, and many are just plain text. When you need to edit a config file or read a log, you want a human-readable file. Let’s start with the basics and build up to advanced methods.

What Makes A File Human Readable In Linux

A human-readable file contains characters you can read directly. These are typically ASCII or UTF-8 text. Binary files, like executables or images, show garbled symbols when opened in a text editor.

The file command reports file types. It looks at the file’s magic numbers and content, and tells you if a file is “ASCII text” or “UTF-8 Unicode text” (newer versions print “Unicode text, UTF-8 text”). These are your human-readable files.

Other files show as “data”, “ELF 64-bit LSB executable”, or “JPEG image data”. Those are not human-readable in the traditional sense.

How To Find Human Readable File In Linux

The quickest way is using the file command with find. This combo scans directories and filters out binary files. Here is the basic command:

find /path -type f -exec file {} \; | grep -i "text"

This finds all regular files, runs file on each, and shows only lines containing “text”. Because grep sees the whole line, a binary file whose path contains “text” also matches, so treat the result as a first pass.

But wait, there is a better way. The file command can output MIME types. Use this:

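find /path -type f -exec file --mime-type {} \; | grep ": text/"

This keeps files whose MIME type starts with “text/”. The leading “: ” ties the match to the type field, so a directory named text/ in the path does not trigger it. It catches plain text, HTML, XML, and more.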

Using Find With The Readable Test

The find command has a -readable test. It checks if the current user can read the file. Combine it with -type f for regular files:

find /home -type f -readable

This shows all readable files, but not all are human-readable. Binary files can be readable too. You still need the file command to filter.

A more precise one-liner:

find /home -type f -readable -exec sh -c 'file "$1" | grep -q "text"' _ {} \; -print

This prints only files that pass both the readable test and the text check.

Using Grep To Detect Text Content

Another method is using grep to check if a file contains printable characters. The -I option in grep treats binary files as non-matching. Try this:

grep -rIl "" /path

The -r means recursive, -I ignores binary files, -l lists only file names, and the empty pattern matches everything. This returns all files that grep considers text.

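This method is fast but not perfect. grep decides a file is binary mainly by looking for NUL bytes (and, in some locales, invalid character sequences), so a binary file without them can slip through. Empty files are never listed because they have no line to match. Still, it works for most cases.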
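To see what grep -I keeps and what it drops, here is a small sketch that builds a throwaway directory and runs the same command on it. The helper name list_text_files and the sample file names are made up for illustration.

```shell
# list_text_files is a hypothetical helper name, not a standard command.
list_text_files() {
    # -r recurse, -I treat binary files as non-matching, -l print names only
    grep -rIl '' -- "$1" 2>/dev/null
}

# Build a small sample directory
demo_dir=$(mktemp -d)
printf 'hello world\n' > "$demo_dir/notes.txt"
printf 'abc\000def\n'  > "$demo_dir/blob.bin"   # the NUL byte marks it as binary
: > "$demo_dir/empty.txt"                        # empty: no line to match

# Lists notes.txt only
list_text_files "$demo_dir"
```

The empty file is skipped even though it is harmless text, so add a separate check if you need those.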

Using The File Command Alone

The file command is your best friend. Run it on a single file:

file example.txt

Output: example.txt: ASCII text

For a directory, use a loop:

for f in /path/*; do file "$f"; done | grep "text"

This checks every file in the directory. It shows only those with “text” in the description.

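You can also use find with -exec as shown earlier, which also descends into subdirectories. Ending -exec with + instead of \; passes many files to each file call, which is much faster on large directories.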

Checking MIME Types For Accuracy

MIME types give a standardized way to identify file content. The file command with --mime-type outputs things like:

  • text/plain for plain text
  • text/html for HTML
  • text/x-shellscript for shell scripts
  • text/x-c for C source code

All of these are human-readable. Use this filter:

find /path -type f -exec file --mime-type {} \; | grep "^.*: text/"

This gives you a clean list of human-readable files with their MIME types.
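If you only want the paths, one option is to test the MIME type on its own instead of grepping the combined output. The sketch below assumes the file utility is installed and supports -b (brief) and --mime-type; list_text_by_mime is a hypothetical helper name.

```shell
# list_text_by_mime is a hypothetical helper name.
# Prints the path of every regular file under $1 whose MIME type is text/*.
list_text_by_mime() {
    find "$1" -type f -exec sh -c '
        for f do
            # -b prints only the type, so the path never reaches the pattern
            case $(file -b --mime-type "$f") in
                text/*) printf "%s\n" "$f" ;;
            esac
        done
    ' sh {} +
}
```

Usage: list_text_by_mime /etc. Because the type is compared separately from the name, a path that happens to contain “text/” or a colon does not cause a false match.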

Finding Human Readable Files By Size

Sometimes you want only small text files. Logs and configs are usually under 1 MB. Combine size with text detection:

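find /var/log -type f -size -1024k -exec file {} \; | grep "text"

This finds text files smaller than about 1 MiB in /var/log. The limit is written as -1024k because find rounds sizes up to whole units, so -size -1M would match only empty files. Adjust the size as needed.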

For large text files, like database dumps, use -size +10M. The same command works with a larger size.
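One detail worth checking on your own system is how -size rounds. With GNU find (assumed here), sizes are rounded up to whole units before the comparison, so a 9-byte file already counts as 1M. The sketch below uses made-up file names in a temporary directory.

```shell
demo_dir=$(mktemp -d)
: > "$demo_dir/empty.log"                      # 0 bytes
printf 'one line\n' > "$demo_dir/small.log"    # 9 bytes

# Rounded up to whole MiB, small.log counts as 1M, so "less than 1M"
# matches only the empty file.
find "$demo_dir" -type f -size -1M

# Byte units are exact: this matches every file smaller than 1 MiB.
find "$demo_dir" -type f -size -1048576c
```

The first command prints only empty.log; the second prints both files.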

Excluding Binary Files With Find

The find command has no built-in “text only” filter. But you can use -exec with a script. Here is a reliable one:

find /path -type f -exec sh -c 'file "$1" | grep -q "text"' _ {} \; -print

This runs file on each file and prints only those with “text” in the output. It is slower but accurate.

For better performance, use grep -rIl as mentioned. It is much faster on large directories.

Using The Strings Command

The strings command extracts readable text from any file. It is useful for finding text in binary files, but not for identifying human-readable files directly.

To check if a file is mostly text, compare its size to the output of strings:

if [ $(strings "$file" | wc -c) -gt $(($(stat -c%s "$file") / 2)) ]; then echo "Mostly text"; fi

This is a hacky method. Stick with file and grep for reliable results.

Checking For UTF-8 And Other Encodings

Human-readable files can be UTF-8, ASCII, or ISO-8859-1. The file command detects these:

file --mime-encoding example.txt

Output: example.txt: utf-8

Use this in your find command:

find /path -type f -exec file --mime-encoding {} \; | grep -E "utf-8|us-ascii|iso-8859"

This finds files with common text encodings. It catches most human-readable files.

Practical Examples For Daily Use

Let’s put it all together. Here are real-world scenarios.

Find All Config Files In /etc

find /etc -type f -exec file {} \; | grep "text" | cut -d: -f1

This lists all text files in /etc. Config files are usually human-readable.

Find Log Files In /var/log

find /var/log -type f -name "*.log" -exec file {} \; | grep "text"

Log files are often text, but some are compressed. This filters only the text ones.

Find Scripts In Your Home Directory

find ~ -type f -exec file {} \; | grep "shell script"

Shell scripts are human-readable. This finds them anywhere in your home folder.

Using The Locate Command

The locate command is faster than find because it uses a database. But it does not check file content. Combine it with file:

locate "*.txt" | xargs file | grep "text"

This finds all .txt files and checks their type. It is useful for known extensions.

For a broader search, omit the extension:

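locate -0 -r ".*" | xargs -0 file 2>/dev/null | grep "text"

This searches the entire database. The -0 options pass names NUL-separated, so spaces in file names are safe and file runs in batches rather than once per path. It can still be slow with many files.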

Updating The Locate Database

Run sudo updatedb before using locate. This ensures the database is current. Without it, you might miss new files.

Using The Find Command With Regex

You can use regex patterns with find to match file names. Common text file extensions include .txt, .md, .conf, .log, .sh, .py, and .html.

find /path -type f -regex ".*\.\(txt\|md\|conf\|log\|sh\|py\|html\)$"

This finds files with those extensions. But it misses text files without extensions. Use the file method for completeness.

Combining Regex With File Check

For best results, use both:

find /path -type f -regex ".*\.\(txt\|md\|conf\)$" -exec file {} \; | grep "text"

This narrows down the search and confirms the type.

Handling Special Characters In File Names

File names with spaces or special characters can break commands. Use null separators with -print0 and xargs -0:

find /path -type f -print0 | xargs -0 file | grep "text"

This handles any file name safely. Always use this in scripts.
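A quick way to see the difference is to compare both pipelines on a name that contains a space. In this sketch grep -Il '' stands in for the text check so it runs even without the file utility; the directory and file names are made up.

```shell
demo_dir=$(mktemp -d)
printf 'alpha\n' > "$demo_dir/my notes.txt"
printf 'beta\n'  > "$demo_dir/plain.txt"

# Unsafe: xargs splits on whitespace, so "my notes.txt" becomes two bogus
# arguments and only plain.txt is reported. "|| true" absorbs the non-zero
# status xargs returns for the failed lookups.
find "$demo_dir" -type f | xargs grep -Il '' 2>/dev/null || true

# Safe: NUL-terminated names survive spaces, quotes and newlines,
# so both files are reported.
find "$demo_dir" -type f -print0 | xargs -0 grep -Il ''
```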

Using The -L Option With Symlinks

By default, find does not follow symbolic links. Use -L to follow them:

find -L /path -type f -exec file {} \; | grep "text"

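This checks the target of each symlink. Be careful: links that point back to a parent directory create loops. GNU find detects these and prints a warning, but following links can still pull large extra trees into the search.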

Automating The Search With A Script

Create a reusable script. Save this as findtext.sh:

#!/bin/bash
find "$1" -type f -exec file {} \; | grep "text" | cut -d: -f1

Make it executable with chmod +x findtext.sh. Run it with ./findtext.sh /path.

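For a more advanced version, add an optional size filter and switch to MIME types:

#!/bin/bash
find "$1" -type f ${2:+-size "$2"} -exec file --mime-type {} \; | grep ": text/" | cut -d: -f1

Usage: ./findtext.sh /var/log -1024k finds text files under about 1 MB. The size argument is optional; without it, every regular file is checked. Avoid -1M here: find rounds sizes up to whole units, so -1M matches only empty files.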

Using The Script With Cron

Schedule the script to run daily. Edit your crontab with crontab -e and add:

0 2 * * * /home/user/findtext.sh /home/user > /home/user/textfiles.txt

This saves a list of human-readable files every night.

Common Pitfalls And Solutions

Here are mistakes to avoid.

Binary Files With Text Content

Some binary files contain readable strings. PDFs and Word documents have text but are not plain text. The file command identifies them as “PDF document” or “Microsoft Word”. They will not show up in your text search.

If you want to include them, use a different filter. But for true human-readable files, stick with text MIME types.

Empty Files

Empty files are human-readable but contain nothing. The file command says “empty”. Use the -empty test in find to list them:

find /path -type f -empty -exec file {} \; | grep "empty"

Decide if you want to include empty files in your results.
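Both choices can be expressed with the same test, negated or not. A minimal sketch with made-up file names, assuming a find that supports -empty (GNU and BSD find do):

```shell
demo_dir=$(mktemp -d)
: > "$demo_dir/placeholder.conf"               # zero bytes
printf 'key=value\n' > "$demo_dir/app.conf"    # has content

# Only the zero-byte file
find "$demo_dir" -type f -empty

# Everything with content; add the text check of your choice after this
find "$demo_dir" -type f ! -empty
```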

Permission Denied Errors

Some files are not readable due to permissions. Use sudo to bypass:

sudo find /root -type f -exec file {} \; | grep "text"

Be careful with system directories. Only use sudo when necessary.

Performance Tips For Large Directories

Searching millions of files can be slow. Here are optimizations.

Limit Depth With -maxdepth

Use -maxdepth to limit recursion:

find /path -maxdepth 3 -type f -exec file {} \; | grep "text"

This only goes three levels deep.

Use Grep Instead Of File For Speed

The grep -rIl method is much faster than file. Use it for quick scans:

grep -rIl "" /path 2>/dev/null

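This is faster because one grep process handles the whole tree and stops reading each file at its first matching line, while the file approach starts a new process for every file.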

Exclude Directories With -prune

Skip directories you do not need:

find /path -path /path/exclude -prune -o -type f -exec file {} \; | grep "text"

This speeds up the search significantly.
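The -prune expression is easier to follow on a tiny tree. This sketch uses a made-up layout with a cache directory to skip, and -print in place of the file check so the effect of pruning is visible on its own.

```shell
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/src" "$demo_dir/cache"
printf 'code\n' > "$demo_dir/src/main.sh"
printf 'junk\n' > "$demo_dir/cache/tmp.txt"

# Left of -o: when the path is the cache directory, do not descend into it.
# Right of -o: for everything else, print regular files.
find "$demo_dir" -path "$demo_dir/cache" -prune -o -type f -print
```

Only src/main.sh is printed. The explicit -print (or -exec) on the right-hand side matters: without an action there, find would also print the pruned directory itself.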

Using The Xargs Command For Efficiency

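The xargs command batches arguments, and with -P it runs several commands in parallel. Use it with file:

find /path -type f -print0 | xargs -0 -n 100 -P4 file | grep "text"

The -P4 option runs up to four processes at once, and -n 100 gives each one at most 100 files, so the work is actually split. Adjust both based on your CPU cores and file count.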

This is much faster on multi-core systems.
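Here is a small sketch of the batching, assuming an xargs that supports -0, -n and -P (GNU and BSD versions do). grep -Il '' stands in for file as the per-batch text check, and the ten sample files are made up.

```shell
demo_dir=$(mktemp -d)
i=1
while [ "$i" -le 10 ]; do
    printf 'file %s\n' "$i" > "$demo_dir/note$i.txt"
    i=$((i + 1))
done

# -n 2 caps each batch at two names; -P 4 allows up to four batches at once.
# Parallel output arrives in no fixed order, so sort it for stable results.
find "$demo_dir" -type f -print0 | xargs -0 -n 2 -P 4 grep -Il '' | sort
```

Without -n, xargs may put every name into a single command line, in which case -P has nothing to run in parallel.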

Combining With Sort And Uniq

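If the same path can appear more than once, for example when you search overlapping directories, use sort and uniq:

find /path -type f -exec file {} \; | grep "text" | cut -d: -f1 | sort | uniq

This removes duplicate paths from the list. It does not detect different files with identical content.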

Real World Use Cases

Let’s see how this helps in daily tasks.

Finding Configuration Files After A System Update

After an update, you might need to check changed configs. Run:

find /etc -type f -mtime -1 -exec file {} \; | grep "text"

This finds text files modified in the last day.
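To check what -mtime -1 selects, you can backdate a file with touch and see which one survives the filter. The file names below are made up, and the file check is left out so the time test stands alone.

```shell
demo_dir=$(mktemp -d)
printf 'new\n' > "$demo_dir/fresh.conf"
printf 'old\n' > "$demo_dir/stale.conf"

# Backdate stale.conf to 1 January 2000, 00:00
touch -t 200001010000 "$demo_dir/stale.conf"

# -mtime -1 keeps files modified less than 24 hours ago: fresh.conf only
find "$demo_dir" -type f -mtime -1
```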

Locating Scripts In A Project

In a development directory, find all scripts:

find /project -type f -exec file {} \; | grep -E "script|text"

This catches shell scripts, Python files, and text configs.

Checking Log Files For Errors

Find recent log files and search for errors:

find /var/log