Finding Files with Spaces in Filenames

I guess I’m too old school: I still don’t like using a blank space as a separator between words in a file name. At every opportunity I replace the blank spaces with underscores, because white space tends to trigger bugs in file handling scripts. Unless every command in a script that works with file names accounts properly for blanks, you’re risking some funny behavior from your otherwise “working-fine” script. One frequent annoyance is running find on a directory full of files with names containing spaces and trying to pipe the output to xargs for further processing. Say I’m trying to search for files containing either “foo” or “bar” in their contents, in a directory containing the following file names:

# ls -1
My Expenses
My Trip
Things To Do

Well, running find with the usual arguments does not look so good:

# find . -type f -print | xargs egrep "(foo|bar)"
egrep: can't open ./My
egrep: can't open Expenses
egrep: can't open ./My
egrep: can't open Trip
egrep: can't open ./Things
egrep: can't open To
egrep: can't open Do

Not very useful: find prints one file name per line, but xargs by default splits its input on blanks as well as newlines, so every word of a name becomes a separate argument. The way to make this work is to have find terminate each file name with a null character (-print0) and tell xargs to honor that delimiter (-0), so it becomes very simple:

# find . -type f -print0 | xargs -0 egrep "(foo|bar)"
My Expenses:foo
My Trip:bar
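
To see the splitting behavior in isolation, here is a minimal sketch that feeds one file name to xargs both ways and counts how many arguments the command receives. The `sh -c 'echo "$#"'` helper and the variable names are only for illustration, and the second pipeline assumes an xargs that supports -0:

```shell
#!/bin/sh
# Illustrative only: the inline "sh -c" helper prints how many arguments it got.

# Newline-terminated input: by default xargs also splits on the blank,
# so "My Expenses" arrives as two separate arguments.
split_count=$(printf '%s\n' 'My Expenses' | xargs sh -c 'echo "$#"' sh)

# NUL-terminated input read with -0: the blank is kept,
# so the whole name arrives as a single argument.
whole_count=$(printf '%s\0' 'My Expenses' | xargs -0 sh -c 'echo "$#"' sh)

echo "default xargs: $split_count argument(s)"
echo "xargs -0:      $whole_count argument(s)"
```

The first count should come out as 2 and the second as 1, which is exactly the difference between the failing and the working find pipelines above.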

One big drawback of this technique is that it is not universal: -print0 and -0 are extensions found in GNU find and xargs (and some other modern implementations), so if yours don’t support them, you’re pretty much out of luck. That applies to many of us running “classic” Unix systems. The other technique is slightly more subtle and uses the -exec switch in find, but of course you lose some of the niceties associated with xargs:

# find . -type f -exec egrep -l "(foo|bar)" "{}" \;
./My Expenses
./My Trip
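
One of those niceties is batching, and it can often be recovered: POSIX also defines a form of -exec that ends in `+` instead of `\;`, which passes many file names to a single command invocation much like xargs does. Very old find implementations may not have it, so treat the following as a sketch to try rather than a guarantee. The scratch directory and file contents are made up to mirror the example:

```shell
#!/bin/sh
# Hypothetical scratch directory recreating the three example files.
demo_dir=$(mktemp -d)
printf 'foo\n' > "$demo_dir/My Expenses"
printf 'bar\n' > "$demo_dir/My Trip"
printf 'nothing here\n' > "$demo_dir/Things To Do"

# "{} +" makes find append as many file names as fit to one grep call.
# Each name is passed as its own argument, so embedded spaces are safe.
batched=$(cd "$demo_dir" && find . -type f -exec grep -El '(foo|bar)' {} + | LC_ALL=C sort)

printf '%s\n' "$batched"
rm -rf "$demo_dir"
```

This should list ./My Expenses and ./My Trip, the same files as the `\;` version, while starting grep far fewer times on a large directory tree.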
