Using the find command

The find command is used to recursively locate files in a directory hierarchy. Since programmers and system administrators spend a great deal of time working with files, familiarity with this command can make each more efficient at the terminal.

The command is composed of four parts:

  • the command name
  • options that control how the command searches (optional)
  • the path(s) to search (required)
  • expressions (composed of “primaries” and “operators”) that filter files by name, type, timestamp, and other criteria (optional)

A primary is a switch, such as -name or -regex, that may or may not have additional arguments. An operator, such as -or or -not, combines expressions in logical ways.

Basic usage

Finding files by name is perhaps the most common use of the find command. Its output consists of all paths in which the file name appears within the directory structure it searches.

$ find . -name README.md
./node_modules/hexo-renderer-stylus/README.md
./node_modules/is-extendable/README.md
./node_modules/striptags/README.md
...
./README.md

NOTE: All terminal examples were generated within the directory structure of my blog, created by the Hexo static site generator.

The path given to find in the first argument is the path prepended to each result. The . path instructs find to search under the working directory and generate relative paths in the output. If an absolute path to the working directory is used, however, the full path will appear in results. Command substitution may be used to strike a compromise between brevity and more detailed output.

$ find "$(pwd)" -name README.md
/Users/nicholascloud/projects/nicholascloud.com/node_modules/hexo-renderer-stylus/README.md
/Users/nicholascloud/projects/nicholascloud.com/node_modules/is-extendable/README.md
/Users/nicholascloud/projects/nicholascloud.com/node_modules/striptags/README.md
...
/Users/nicholascloud/projects/nicholascloud.com/README.md
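
The relationship between the starting path and the result paths can be reproduced in isolation. This is a minimal sketch, assuming a throwaway directory created with mktemp; the docs directory and its README.md are made up for illustration.

```shell
# Hypothetical sandbox; the directory and file names are invented.
sandbox=$(mktemp -d)
mkdir -p "$sandbox/docs"
touch "$sandbox/docs/README.md"

# A relative starting point yields relative result paths.
relative=$(cd "$sandbox" && find . -name README.md)
echo "$relative"

# An absolute starting point yields absolute result paths.
absolute=$(find "$sandbox" -name README.md)
echo "$absolute"

rm -rf "$sandbox"
```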

If a filtering expression is omitted, find will return all file paths within its search purview.

$ find .
.
./scaffolds
./scaffolds/draft.md
./scaffolds/post.md
./scaffolds/page.md
...

Other scenarios

The find command can do much more than locate files by exact name, however. It can find files according to a wide swath of criteria in many different scenarios.

We may not know the exact name of the file for which we are searching

The -name primary supports wildcard searches.

  • an asterisk (*) can replace any consecutive number of characters
  • a question mark (?) can replace any single character

$ find . -name '*javascript*.md'
./source/_posts/javascript-frameworks-for-modern-web-dev.md
./source/_posts/l33t-literals-in-javascript.md
./source/_posts/historical-javascript-objects.md
./source/_posts/maintainable-javascript-book-review.md

By default, the -name primary is case-sensitive. To conduct a case-insensitive search, we can use the -iname primary.

$ find . -iname '*JAVA*.md'
./source/_posts/java-4-ever.md
./source/_posts/javascript-frameworks-for-modern-web-dev.md
./source/_posts/l33t-literals-in-javascript.md
./source/_posts/historical-javascript-objects.md
./source/_posts/maintainable-javascript-book-review.md
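
To see how the two wildcards and -iname differ, here is a small sketch run against a throwaway directory. The file names are hypothetical and exist only to exercise each pattern.

```shell
# Hypothetical sandbox with invented file names.
sandbox=$(mktemp -d)
touch "$sandbox/post-1.md" "$sandbox/post-22.md" "$sandbox/POST-3.MD" "$sandbox/notes.txt"

# '?' matches exactly one character, so post-22.md is excluded.
single=$(cd "$sandbox" && find . -name 'post-?.md')
echo "$single"

# '*' matches any run of characters; -name is case-sensitive, so POST-3.MD is skipped.
many=$(cd "$sandbox" && find . -name 'post-*.md' | wc -l | tr -d ' ')
echo "$many"

# -iname ignores case, so POST-3.MD is matched as well.
anycase=$(cd "$sandbox" && find . -iname 'post-*.md' | wc -l | tr -d ' ')
echo "$anycase"

rm -rf "$sandbox"
```

The patterns are quoted so that the shell passes them to find untouched instead of expanding them against the working directory.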

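For more complex searches, we can use the power of regular expressions with the -regex and -iregex (case-insensitive) primaries. Unlike -name, these match against the entire path of each file, which is why the patterns below begin with .* to absorb the leading directories.

NOTE: With BSD find (the version that ships with macOS), use the -E option to specify that extended regular expressions should be used instead of basic regular expressions. GNU find does not accept -E; it selects the dialect with -regextype instead.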

$ find -E . -regex '.*package(-lock)?\.json'
./node_modules/hexo-renderer-stylus/package.json
./node_modules/is-extendable/package.json
./node_modules/striptags/package.json
./node_modules/babylon/package.json
...
./package-lock.json
./package.json
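
Because -regex is tested against the whole path rather than the file name alone, a pattern without a leading .* matches nothing. The sketch below illustrates this with a hypothetical tree; it sticks to basic regular expression syntax so that it does not depend on the -E option.

```shell
# Hypothetical sandbox with invented file names.
sandbox=$(mktemp -d)
mkdir -p "$sandbox/lib"
touch "$sandbox/package.json" "$sandbox/lib/package.json" "$sandbox/lib/index.js"

# No match: the pattern must describe the whole path (e.g. ./lib/package.json).
bare=$(cd "$sandbox" && find . -regex 'package\.json' | wc -l | tr -d ' ')
echo "$bare"

# A leading '.*/' absorbs the directory portion of the path.
anchored=$(cd "$sandbox" && find . -regex '.*/package\.json' | wc -l | tr -d ' ')
echo "$anchored"

rm -rf "$sandbox"
```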

We may want to limit our results to a specific path, or a pattern that matches multiple paths

To filter results by a path mask, we can specify a pattern with -path and -ipath (case-insensitive). Both support the asterisk and question mark wildcards.

$ find . -path './node_modules/*/lib/*' -regex '.*hexo.*'
./node_modules/hexo-renderer-stylus/lib/renderer.js
./node_modules/hexo-renderer-marked/lib/renderer.js
./node_modules/hexo-generator-archive/lib/generator.js
./node_modules/hexo-migrator-wordpress/node_modules/async/lib/async.js
./node_modules/hexo-log/lib/log.js
./node_modules/hexo-generator-category/lib/generator.js
./node_modules/hexo-i18n/lib/i18n.js
./node_modules/hexo-pagination/lib/pagination.js
./node_modules/hexo-generator-index/lib/generator.js
./node_modules/hexo-util/lib/pattern.js

Using the -path primary does not change the top-level directory in which find begins its search; it merely filters results by sub-directory.
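
A small sketch makes the filtering behavior concrete. It assumes a throwaway tree whose layout loosely imitates a node_modules directory; every name in it is hypothetical.

```shell
# Hypothetical sandbox that loosely imitates a node_modules layout.
sandbox=$(mktemp -d)
mkdir -p "$sandbox/node_modules/some-pkg/lib" "$sandbox/node_modules/some-pkg/test" "$sandbox/lib"
touch "$sandbox/node_modules/some-pkg/lib/log.js" \
      "$sandbox/node_modules/some-pkg/test/log.js" \
      "$sandbox/lib/log.js"

# find still walks everything under '.', but only paths matching the mask are reported.
matches=$(cd "$sandbox" && find . -path './node_modules/*/lib/*' -name '*.js')
echo "$matches"

rm -rf "$sandbox"
```

All three log.js files are visited, yet only the one under node_modules/some-pkg/lib satisfies the -path mask.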

We may want detailed information about the files we find

To see detailed information about a file, in a manner similar to ls -l, the -ls primary may be appended to the list of primaries.

$ find . -name '*.md' -path '*/_posts/*' -ls
8636130637 8 -rw-r--r-- 1 nicholascloud staff 490 Oct 24 12:25 ./source/_posts/strange-loop-2010-video-release-schedule-posted.md
8637286945 8 -rw-r--r-- 1 nicholascloud staff 1570 Oct 24 12:45 ./source/_posts/god-mode-in-windows-7-not-as-cool-as-rise-of-the-triad.md
8636130716 8 -rw-r--r-- 1 nicholascloud staff 204 Oct 24 12:25 ./source/_posts/what-writing-fiction-taught-me-about-writing-software.md
...

(As an alternative to the -ls primary, the -exec primary may be used to invoke ls, or the xargs command may be used for the same purpose.)
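
Both alternatives can be sketched as follows, assuming a throwaway directory with made-up file names. The exact columns that ls prints vary by system, so only the number of lines is of interest here.

```shell
# Hypothetical sandbox with invented file names.
sandbox=$(mktemp -d)
touch "$sandbox/a.md" "$sandbox/b.md" "$sandbox/c.txt"

# -exec runs ls -l once per matching file.
with_exec=$(cd "$sandbox" && find . -name '*.md' -exec ls -l '{}' \;)
echo "$with_exec"

# xargs collects the NUL-terminated paths and passes them to a single ls call.
with_xargs=$(cd "$sandbox" && find . -name '*.md' -print0 | xargs -0 ls -l)
echo "$with_xargs"

rm -rf "$sandbox"
```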

We may want to stop descending into a hierarchy once we’ve found the file(s) for which we’ve searched

The -prune primary tells find not to descend into a directory that matches the expression preceding it. In the example below the matching path itself is still printed, and find continues to search sibling entries at the same directory level for other potential matches.

$ find . -regex '.*middleware.*' -prune
./source/_posts/new-appendto-blog-post-streams-and-middleware-in-strata-js.md
./node_modules/stylus/lib/middleware.js
./node_modules/hexo-server/lib/middlewares
./public/2013/06/new-appendto-blog-post-streams-and-middleware-in-strata-js
./.deploy_git/2013/06/new-appendto-blog-post-streams-and-middleware-in-strata-js

By using the diff tool and IO redirection we can compare the output of a “pruned” result set with the output of unpruned results to see what paths were omitted. For example, in the diff below, the remaining paths that matched /node_modules/hexo-server/lib/middlewares/* were omitted once /node_modules/hexo-server/lib/middlewares had been added to the result set.

$ diff <(find . -regex '.*middleware.*') <(find . -regex '.*middleware.*' -prune)
4,9d3
< ./node_modules/hexo-server/lib/middlewares/route.js
< ./node_modules/hexo-server/lib/middlewares/redirect.js
< ./node_modules/hexo-server/lib/middlewares/logger.js
< ./node_modules/hexo-server/lib/middlewares/gzip.js
< ./node_modules/hexo-server/lib/middlewares/header.js
< ./node_modules/hexo-server/lib/middlewares/static.js
11d4
< ./public/2013/06/new-appendto-blog-post-streams-and-middleware-in-strata-js/index.html
13d5
< ./.deploy_git/2013/06/new-appendto-blog-post-streams-and-middleware-in-strata-js/index.html
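
A common use of -prune is to skip a directory entirely while searching everything else. This idiom is not taken from the examples above; it is a sketch under the assumption of a hypothetical tree containing a node_modules directory and a src directory.

```shell
# Hypothetical sandbox with invented file names.
sandbox=$(mktemp -d)
mkdir -p "$sandbox/node_modules/pkg" "$sandbox/src"
touch "$sandbox/node_modules/pkg/index.js" "$sandbox/src/app.js"

# Left of -o: any entry named node_modules is pruned (not descended into).
# Right of -o: everything else is tested against -name and printed explicitly.
pruned=$(cd "$sandbox" && find . -name node_modules -prune -o -name '*.js' -print)
echo "$pruned"

rm -rf "$sandbox"
```

The explicit -print on the right-hand side keeps the pruned directory itself out of the output.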

We may only want to search to a particular depth OR search beyond a particular depth

Several primaries control depth traversal, or how far find will go to locate results.

-maxdepth controls the path depth to which find will traverse before stopping.

$ find . -name '*.css' -maxdepth 3
./public/fancybox/jquery.fancybox.css
./public/css/style.css
./.deploy_git/fancybox/jquery.fancybox.css
./.deploy_git/css/style.css

-mindepth controls the path depth at which find will start to search.

$ find . -name '*.css' -mindepth 6
./node_modules/hexo/node_modules/hexo-cli/assets/themes/landscape/source/fancybox/jquery.fancybox.css
./node_modules/hexo/node_modules/hexo-cli/assets/themes/landscape/source/fancybox/helpers/jquery.fancybox-thumbs.css
./node_modules/hexo/node_modules/hexo-cli/assets/themes/landscape/source/fancybox/helpers/jquery.fancybox-buttons.css
./themes/landscape/source/fancybox/helpers/jquery.fancybox-thumbs.css
./themes/landscape/source/fancybox/helpers/jquery.fancybox-buttons.css

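-depth n specifies the exact depth at which find will search. (This form is specific to BSD find, such as the version that ships with macOS; in GNU find, -depth takes no argument and instead requests a depth-first traversal.)

$ find . -name '*.css' -depth 5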
./node_modules/async-limiter/coverage/lcov-report/prettify.css
./node_modules/async-limiter/coverage/lcov-report/base.css
./themes/landscape/source/fancybox/jquery.fancybox.css
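
The depth primaries can be compared side by side on a small tree. This sketch assumes a hypothetical three-level directory layout, and it selects an exact depth by giving -mindepth and -maxdepth the same value, which avoids relying on the argument form of -depth.

```shell
# Hypothetical sandbox with invented file names.
sandbox=$(mktemp -d)
mkdir -p "$sandbox/css/vendor"
touch "$sandbox/top.css" "$sandbox/css/style.css" "$sandbox/css/vendor/lib.css"

# Depth 0 is the starting point itself, so top.css sits at depth 1.
shallow=$(cd "$sandbox" && find . -maxdepth 1 -name '*.css')
echo "$shallow"

# Only entries at depth 3 or deeper are tested.
deep=$(cd "$sandbox" && find . -mindepth 3 -name '*.css')
echo "$deep"

# Equal bounds select exactly one level.
exact=$(cd "$sandbox" && find . -mindepth 2 -maxdepth 2 -name '*.css')
echo "$exact"

rm -rf "$sandbox"
```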

We may want to find files that are newer/older relative to another file

The -newer primary will find files that are newer than the specified file by comparing the modification times of each.

$ ls -l source/_drafts/
-rw-r--r-- 1 nicholascloud staff 189 Nov 7 11:51:20 2018 the-importance-of-names.md
-rw-r--r-- 1 nicholascloud staff 353 Nov 7 11:50:49 2018 the-most-satisfying-thing.md
-rw-r--r-- 1 nicholascloud staff 10812 Nov 8 19:13:09 2018 using-the-find-command.md

$ find . -newer source/_drafts/the-importance-of-names.md -path '*_drafts*'
./source/_drafts/using-the-find-command.md
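
The comparison can be reproduced with files whose modification times are set by hand. In this sketch the file names and dates are arbitrary, and touch -t is used to assign the timestamps. Negating -newer with ! is one way to find files that are not newer than the reference.

```shell
# Hypothetical sandbox; touch -t takes a CCYYMMDDhhmm timestamp.
sandbox=$(mktemp -d)
touch -t 201811071150 "$sandbox/older.md"
touch -t 201811071151 "$sandbox/reference.md"
touch -t 201811081913 "$sandbox/newer.md"

# Files modified after reference.md.
newer=$(cd "$sandbox" && find . -type f -newer reference.md)
echo "$newer"

# ! -newer also matches the reference file itself, so it is excluded by name.
not_newer=$(cd "$sandbox" && find . -type f ! -newer reference.md ! -name reference.md)
echo "$not_newer"

rm -rf "$sandbox"
```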

For more fine-grained control, use the -newerXY primary, where the values of X and Y represent different kinds of file timestamps (see table below). The timestamp for X applies to the files that find evaluates; that of Y applies to the file path argument supplied for comparison.

X/Y flag    Meaning
a           access time
B           inode creation time
c           change time (file attributes)
m           modification time (file contents)
t (Y only)  the file argument is interpreted as a date understood by cvs(1)

For example, the command find . -neweram foo.txt will find all files that have a newer access time than the modification time of foo.txt.

For each X flag there are shortcut primaries that make a comparison against the modification time of the file argument.

  • -anewer compares the access time of each file in the result set to the modification time of the specified file.
  • -Bnewer compares the inode creation time of each file in the result set to the modification time of the specified file.
  • -cnewer compares the change time of each file in the result set to the modification time of the specified file.
  • -mnewer compares the modification time of each file in the result set to the modification time of the specified file, and is identical to -newer.

We may want to find files of a particular type

In Unix-like systems, “everything is a file”, and these files have types. The find command can detect file type and filter results accordingly with the -type primary. Regular files (for which we search most often) have a type of f; directories have a type of d. Block files – disks, for example – have a type of b.

In OSX it is easy to find all block files that represent disks (physical and logical).

$ find /dev -name 'disk*' -type b
/dev/disk0
/dev/disk0s1
/dev/disk0s2
/dev/disk1
/dev/disk1s2
/dev/disk1s3
/dev/disk1s1
/dev/disk1s4

The table below lists each file type that the find command may detect.

Flag  Meaning
b     block special
c     character special
d     directory
f     regular file
l     symbolic link
p     FIFO
s     socket
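
Several of these types can be created without special privileges, which makes -type easy to try out. The sketch below assumes a throwaway directory and hypothetical names, and uses ln -s and mkfifo to create a symbolic link and a FIFO.

```shell
# Hypothetical sandbox containing one entry of each of four types.
sandbox=$(mktemp -d)
mkdir "$sandbox/dir"
touch "$sandbox/file.txt"
ln -s file.txt "$sandbox/link"
mkfifo "$sandbox/pipe"

files=$(cd "$sandbox" && find . -type f)
# -mindepth 1 leaves out the starting directory '.', which is itself a directory.
dirs=$(cd "$sandbox" && find . -mindepth 1 -type d)
links=$(cd "$sandbox" && find . -type l)
fifos=$(cd "$sandbox" && find . -type p)
echo "$files" "$dirs" "$links" "$fifos"

rm -rf "$sandbox"
```

By default find does not follow symbolic links, so the link is reported as type l rather than as the regular file it points to.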

We may want to search for files that a particular user or group owns (or inversely, that are not owned by a known user or group)

Users and groups are identified by name and numeric ID on Unix-like systems. In OSX the id command tells me my user ID and group ID(s).

$ id
uid=501(nicholascloud) gid=20(staff)...

The find command accepts primaries that filter file results by user and/or group.

  • -uid <uid> and -user <username> filter results by owning user. If the argument to -user is numeric, and no user exists with that name, it is assumed to be a user ID.
  • -gid <gid> and -group <groupname> filter results by owning group. The same caveat applies to groupname as username.

I write code for a website called OwlEyes.org, which is a PHP application served by the apache2 web server. If I search for files in my home directory owned by the www-data user (the typical apache2 user), I see some interesting results.

$ find . -user www-data
./projects/enotes/owleyesorg/app/logs/apache-error.log
./projects/enotes/owleyesorg/app/logs/apache-custom.log

Every other file in my project directory is owned by my user, but the apache2 log files are written by the web server, and are therefore owned by its user.

To find files that aren’t owned by any known user and/or group, the inverse primaries may be used.

  • -nouser shows results that do not belong to a known user. It takes no argument.
  • -nogroup shows results that do not belong to a known group. It takes no argument.
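
Ownership filters can be tried without a second account by comparing against the current user. This is a sketch that assumes a throwaway directory and uses id -u to obtain the numeric user ID; the file name is made up.

```shell
# Hypothetical sandbox; files created here belong to the current user.
sandbox=$(mktemp -d)
touch "$sandbox/mine.txt"

# Files owned by the current numeric user ID.
owned=$(cd "$sandbox" && find . -type f -uid "$(id -u)")
echo "$owned"

# ! negates the test: files owned by anyone else (none in this sandbox).
foreign=$(cd "$sandbox" && find . -type f ! -uid "$(id -u)")
echo "$foreign"

# -nouser is given no argument; it lists files whose owner has no account.
(cd "$sandbox" && find . -nouser)

rm -rf "$sandbox"
```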

We may want to find empty files or directories

To find empty files (0 bytes) or empty directories (containing no entries), append the -empty primary to the find command. This can be useful, for example, to see what log files are empty on your system.

$ cd /var/log
$ sudo find . -empty
./appfirewall.log
./ppp
./alf.log
./apache2
./com.apple.xpc.launchd
./cups
./CoreDuet
./uucp
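
Combining -empty with -type separates empty files from empty directories. The following sketch assumes a throwaway directory with hypothetical names, so it needs no elevated privileges.

```shell
# Hypothetical sandbox with one empty file, one empty directory, and non-empty peers.
sandbox=$(mktemp -d)
mkdir "$sandbox/empty-dir" "$sandbox/full-dir"
touch "$sandbox/empty.log"
echo "some text" > "$sandbox/app.log"
echo "some data" > "$sandbox/full-dir/data.txt"

empty_files=$(cd "$sandbox" && find . -type f -empty)
echo "$empty_files"

empty_dirs=$(cd "$sandbox" && find . -type d -empty)
echo "$empty_dirs"

rm -rf "$sandbox"
```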

We may want to run a utility on the files identified by find

The -exec and -ok primaries may be used to run a command on each file in find’s result set. The two primaries are identical, except that -ok will request user confirmation for each file before executing the specified command.

The syntax for executing a command with find is:

find <path(s)> <expression(s)> -exec <command> \;

The command is written in standard form, as you would type it in a terminal. If the string '{}' appears anywhere in the command, it will be replaced by the file path of each result as find iterates over them. Commands must be terminated by a \;. (The escape character is necessary when executing within a shell environment.)

The command find . -newer db.json -type f -exec cp '{}' ~/tmp \;:

  • starts in the current directory
  • finds files that were modified after db.json (the database file that stores blog post information)
  • finds files of type “regular file”
  • and copies each one to ~/tmp

$ ls -lh db.json
-rw-r--r-- 1 nicholascloud staff 2.6M Nov 8 19:15 db.json

$ find . -newer db.json -type f -exec cp '{}' ~/tmp \;

$ ls -l ~/tmp
-rw-r--r-- 1 nicholascloud staff 61B Nov 9 10:46 README.md
-rw-r--r-- 1 nicholascloud staff 14K Nov 9 10:46 using-the-find-command.md

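Two corresponding primaries, -execdir and -okdir, behave like -exec and -ok, except that the command is run from the directory that contains each matched file, and '{}' is replaced with the file name alone rather than its path relative to the search root. Terminating -exec with + instead of \; replaces '{}' with as many file paths as possible per invocation, which makes it akin to xargs. In the example below, tar prints bare file names because -execdir runs it inside each file's directory. Because the command ends with \;, tar runs once per file and each run overwrites the archive, so a + terminator is needed to collect every file into one tarball.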

$ find . -newer db.json -type f -execdir tar cvzf ~/tmp/back.tar.gz '{}' \;
a using-the-find-command.md
a README.md
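
To gather every matching file into a single archive, the command can be terminated with + so that tar is invoked once with all of the paths. This is a sketch under the assumption of two throwaway directories (one to search, one to hold the archive); the file names and the 2018 timestamp are arbitrary.

```shell
# Hypothetical sandbox to search, plus a separate directory for the archive.
sandbox=$(mktemp -d)
backup=$(mktemp -d)
touch -t 201811081915 "$sandbox/db.json"
touch "$sandbox/README.md" "$sandbox/draft.md"   # modified now, so newer than db.json

# '+' makes find append as many paths as fit to a single tar invocation,
# so both files end up in the same archive.
(cd "$sandbox" && find . -type f -newer db.json -exec tar -cf "$backup/back.tar" '{}' +)

# List the archive to count its entries.
archived=$(tar -tf "$backup/back.tar" | wc -l | tr -d ' ')
echo "$archived"

rm -rf "$sandbox" "$backup"
```

Keeping the archive outside the searched directory prevents find from picking up the tarball while it is being written.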

We may want to format find’s output

The output from find can be formatted in two ways.

By specifying the -print primary, the file path of each result in find’s result set is printed to standard output, terminated by a newline. This is the way find displays results by default. However, when the expression contains an action of its own, such as -exec, find no longer prints each file implicitly. The command find . -newer db.json -type f -exec cp '{}' ~/tmp \; will copy all files newer than db.json to ~/tmp, but the output will remain empty (the default behavior of the cp command). To force each file to be displayed, the -print primary may be added before -exec.

$ find . -newer db.json -type f -print -exec cp '{}' ~/tmp \;
./source/_drafts/using-the-find-command.md
./README.md

The -print0 primary terminates each file path with a NUL character instead of a newline. Because NUL cannot appear in a file name, this output is safe to pipe to xargs -0 or a similar command that expects NUL-delimited input, even when paths contain spaces or newlines.
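
The difference shows up as soon as a file name contains a space. In this sketch, which assumes a throwaway directory and made-up file names, xargs -n 1 echo prints one line per argument it receives, so the line count reveals how the paths were split.

```shell
# Hypothetical sandbox; one file name contains a space.
sandbox=$(mktemp -d)
touch "$sandbox/my notes.md" "$sandbox/plain.md"

# Whitespace splitting: xargs sees "./my" and "notes.md" as separate arguments.
split=$(cd "$sandbox" && find . -name '*.md' | xargs -n 1 echo | wc -l | tr -d ' ')
echo "$split"

# NUL-terminated paths arrive intact, one argument per file.
intact=$(cd "$sandbox" && find . -name '*.md' -print0 | xargs -0 -n 1 echo | wc -l | tr -d ' ')
echo "$intact"

rm -rf "$sandbox"
```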

We may want to combine or negate expressions

By default, primaries are joined by an implicit -and, so a file must satisfy all of them to appear in the results. The find command also supports operators that change the way expressions are applied. If two expressions are separated by the -or operator, then they will be applied in a boolean OR fashion; results will be returned that match either expression, or both.

$ find . -name '*eclipse*' -or -name '*clean*'
./source/images/2011/10/eclipse-example.png
./source/images/2011/10/eclipse-example-150x127.png
...
./source/images/2011/08/clean-coders1.png
./source/images/2011/08/clean-coders-150x117.png
...

If the -not (or !) operator precedes an expression, it will negate it and remove matching file paths from the result set.

$ find . -name '*eclipse*'
./source/images/2011/10/eclipse-example.png
./source/images/2011/10/eclipse-example-150x127.png
./source/images/2011/10/eclipse-example-2-300x97.png
./source/images/2011/10/eclipse-example-2.png

$ find -E . -name '*eclipse*' ! -regex '.*[0-9]+x[0-9]+.*'
./source/images/2011/10/eclipse-example.png
./source/images/2011/10/eclipse-example-2.png

(Recall that the -E option in the example above forces find to use extended regular expressions when evaluating the -regex primary.)
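
When -or and negation appear in the same command, escaped parentheses control how they group. The sketch below assumes a throwaway directory with hypothetical image names, and uses a -name wildcard in place of -regex to spot thumbnail-style names such as 150x127.

```shell
# Hypothetical sandbox with invented image names.
sandbox=$(mktemp -d)
touch "$sandbox/eclipse.png" "$sandbox/eclipse-150x127.png" "$sandbox/clean.jpg" "$sandbox/notes.txt"

# Without parentheses the implicit AND binds tighter than -o, so the ! test
# only constrains the '*.jpg' alternative and the thumbnail slips through.
ungrouped=$(cd "$sandbox" && find . -name '*.png' -o -name '*.jpg' ! -name '*-[0-9]*x[0-9]*' | wc -l | tr -d ' ')
echo "$ungrouped"

# Escaped parentheses group the two -name tests, so the ! test applies to both.
grouped=$(cd "$sandbox" && find . \( -name '*.png' -o -name '*.jpg' \) ! -name '*-[0-9]*x[0-9]*' | wc -l | tr -d ' ')
echo "$grouped"

rm -rf "$sandbox"
```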

We may want to delete found files

While it is possible to use -execdir rm '{}' \; to delete files in a result set, find supports a shorter primary, -delete, that accomplishes the same task. By default, -delete will not show output for each file that is removed; place the -print primary before -delete to see which files were removed from the file system.
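
Since deletion cannot be undone, it is worth running the same expression with -print alone before adding -delete. This is a minimal sketch, assuming a throwaway directory and hypothetical file names, so nothing of value is at risk.

```shell
# Hypothetical sandbox with two disposable files and one to keep.
sandbox=$(mktemp -d)
mkdir "$sandbox/cache"
touch "$sandbox/cache/a.tmp" "$sandbox/cache/b.tmp" "$sandbox/keep.md"

# Dry run: list what would be removed.
(cd "$sandbox" && find . -name '*.tmp' -print)

# -print before -delete reports each path as it is removed.
removed=$(cd "$sandbox" && find . -name '*.tmp' -print -delete | wc -l | tr -d ' ')
echo "$removed"

# Only the file that did not match is left.
remaining=$(cd "$sandbox" && find . -type f)
echo "$remaining"

rm -rf "$sandbox"
```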