The HP-UX Admin Man: Confessions and Clarifications

Fred has a hard time remembering anything technical after his vacation, so to ease back into the swing of things, he answers some e-mail inquiries.

I’ve discovered that I must have had a good vacation. I can’t seem to remember anything technical – I think one less beer per day would have prevented this temporary amnesia. Next year, maybe I’ll have two more beers per day and see if all of reality will go away. Until then, I’ll ease back into the saddle and answer some of my e-mail inquiries.

I received some questions about the series of columns discussing the find command (HP Professional, October 1999 to December 1999). I often spout excitedly about the topics I am writing about, so it is no wonder someone tried to use find for something it was not intended to do. The question was: How do I make find stop at one level of the directory structure?

Well, if you only want to build a list of names at a single level, or act on them, find is not the tool for you. If you extract the first five words in the definition of find from the man page, it says something like: "find recursively descends the directory..." If you want to act only on names from one level, shell wildcards can be used effectively to develop a list:

filename=widget

echo ${filename}.*

In the example above, the commands in a Korn shell script would return all names that start with ‘widget.’ in the current directory. You would then have to loop on that ‘list’ of names and test to see which ones met the other qualifications needed. If there are many other qualifications, find really is a good tool for this (that, or a Perl program). To use find without picking up names from subdirectories, you could use the -path option, supplying a pattern that matches any path that has crossed into a subdirectory (meaning it contains two slashes), then negating that test:

find . ! -path './*/*' -name 'widget.*' -type f -perm -1 -print

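The command above will list (-print) those items in the current directory (. ! -path './*/*') whose names start with ‘widget.’ (-name 'widget.*'), are files (-type f), and have execute rights on them (-perm -1, which checks the execute bit for ‘other’). This is one way to limit recursion in find. Strictly speaking, find still walks the subdirectories; the negated -path test simply rejects everything it meets there.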
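The two approaches described above, the shell wildcard with a loop and find with a negated -path, can be compared side by side. What follows is only a minimal sketch: the sandbox directory and file names are hypothetical, and it assumes a mktemp that accepts -d, which may not be true of every system. The third variant, using -prune, is not taken from the column; it is included as one commonly used alternative that stops find from descending at all.

```shell
#!/bin/sh
# Throwaway sandbox; every file name here is hypothetical.
demo=$(mktemp -d)
mkdir "$demo/sub"
touch "$demo/widget.c" "$demo/widget.sh" "$demo/sub/widget.deep"
chmod 644 "$demo/widget.c"
chmod 755 "$demo/widget.sh" "$demo/sub/widget.deep"

# 1. Shell wildcard plus a loop that applies the extra qualifications
#    (regular file, executable by the current user).
wild_list=$(
    cd "$demo" || exit 1
    for name in widget.*; do
        if [ -f "$name" ] && [ -x "$name" ]; then
            echo "./$name"
        fi
    done
)

# 2. find with a negated -path: subdirectories are still visited,
#    but anything with two slashes in its path is rejected.
find_list=$(cd "$demo" && find . ! -path './*/*' -name 'widget.*' -type f -perm -1 -print)

# 3. Assumption (not from the column): -prune keeps find from
#    descending below the starting directory in the first place.
prune_list=$(cd "$demo" && find . ! -name . -prune -name 'widget.*' -type f -perm -1 -print)

echo "wildcard: $wild_list"
echo "-path:    $find_list"
echo "-prune:   $prune_list"
```

In this sandbox all three report only the executable file at the top level; sub/widget.deep is executable too, but it sits one level down and is left out.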

A much simpler question came in about the -print option: Can you think of any examples where omitting the -print does not automatically print the desired results?

HP-UX always does a default -print at the end of the expression if it is omitted. This is not true for many other versions of find.
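As a small illustration of the default -print, the sketch below runs the same search with and without it. It assumes a find that supplies the default (as HP-UX and POSIX-style versions do) and a mktemp that accepts -d; the file names are hypothetical. The last two commands go slightly beyond the column: they show one case where adding an explicit -print changes the output, because the implied ‘and’ binds more tightly than -o.

```shell
#!/bin/sh
# Throwaway sandbox with hypothetical file names.
demo=$(mktemp -d)
touch "$demo/widget.c" "$demo/widget.h" "$demo/notes.txt"

# No action given: the whole expression gets an implied -print.
implicit=$(cd "$demo" && find . -name 'widget.*' | sort)

# The same search with -print written out.
explicit=$(cd "$demo" && find . -name 'widget.*' -print | sort)

# With -o and no action, the implied -print covers both branches.
either=$(cd "$demo" && find . -name '*.c' -o -name '*.h' | sort)

# With -o and an explicit -print, the -print belongs only to the
# second branch, so names matching '*.c' are not printed.
last_only=$(cd "$demo" && find . -name '*.c' -o -name '*.h' -print | sort)

echo "implicit:  $implicit"
echo "explicit:  $explicit"
echo "either:    $either"
echo "last_only: $last_only"
```

Wrapping the alternatives in escaped parentheses, as in \( -name '*.c' -o -name '*.h' \) -print, makes the explicit form print both again.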

A much more difficult question came in about performance: I am hoping that you will briefly discuss performance with the find command. For example, isn’t it true that files are "filtered" left-to-right, such that there could be a performance difference between these two commands:

find / -type d -name tasks -print

find / -name tasks -type d -print

I had heard that the first command will search the entire system, extracting all of the directories, then extract from that list those named ‘tasks.’ In the second, only the files named ‘tasks’ are checked to see if they are directories.

The arguments to the find command form a Boolean expression. As such, it is indeed read from left to right. However, find does not build a list of everything that matches the first item in the argument list and then run that list through the second element.

What actually happens is that each object in the directory structure that find is searching is run through the entire Boolean expression. Once it is finished, the next object goes through the entire expression, and the next one, etc.

If we rewrite the first of the examples from the e-mail to include the default ‘and’ operator between each Boolean argument, it looks like this:

find / -type d -a -name tasks -a -print

Now it is easier to see that there are three tests being performed (or you might want to think of it as two tests and an action to be performed). In order for the last test to be performed (the -print), the first two tests must return true. If any tests return false, then the entire Boolean expression fails, the current object is ‘discarded’ and the next object in the directory structure is run through the set of tests (the Boolean expression).

In this example, the first object that find tests from the / directory is analyzed to see if it is of type directory; if not, then the second object from the / directory is examined. If the second object is of type directory, find then checks to see if it is named ‘tasks’; if not, then this object is rejected, and the third object starts through the expression. If the third object is a directory named ‘tasks,’ then the final test is performed.

This final ‘test,’ the -print, is more like an action, in that it prints the pathname of the current object. Even so, the -print option is indeed a test, one that always returns true; thus it can never fail, and can be used anywhere in an expression. Consequently, there are times when adding a few -print options helps in troubleshooting a find command line.
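The troubleshooting idea can be sketched by dropping an extra -print into the middle of the expression. Because -print always returns true, the probe does not change which objects reach the end; it only shows which ones got that far. The directory layout below is hypothetical, and the sketch assumes a mktemp that accepts -d.

```shell
#!/bin/sh
# Hypothetical tree: one directory named tasks, one plain file named tasks.
demo=$(mktemp -d)
mkdir -p "$demo/a/tasks" "$demo/b"
touch "$demo/b/tasks" "$demo/file1"

# The original expression: only the directory named tasks is printed.
normal=$(find "$demo" -type d -name tasks -print)

# Same expression with a probe after -type d: every directory is
# printed once by the probe, and the match is printed a second time
# by the final -print.
probed=$(find "$demo" -type d -print -name tasks -print)
probe_lines=$(printf '%s\n' "$probed" | wc -l | tr -d ' ')

echo "normal: $normal"
echo "probed ($probe_lines lines):"
printf '%s\n' "$probed"
```

Here the probe shows four directories passing -type d (the sandbox itself, a, a/tasks and b), while the plain file b/tasks never appears because it fails the first test.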

Since the argument list of find is a Boolean expression, the speed does indeed vary by the order the command lines are written in. Using the examples from above:

find / -type d -name tasks -print

find / -name tasks -type d -print

If we assume that there are far more directories than objects named ‘tasks,’ it would be faster to use the second example. Any object not named ‘tasks’ would fail the -name test, thus the -type test need not be performed, saving a ‘stat’ operation on any object that could not pass the whole expression anyway. In general, it is faster to perform name tests before any tests that require a stat operation.
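A quick way to convince yourself that the order affects only speed, never the result, is to run both forms against the same tree and compare the output. The sketch below does that in a small hypothetical sandbox (again assuming a mktemp that accepts -d). It makes no timing claim: any difference depends on the size of the tree and on the find implementation, some of which may reorder cheap tests on their own.

```shell
#!/bin/sh
# Hypothetical tree: one directory named tasks, one plain file named tasks.
demo=$(mktemp -d)
mkdir -p "$demo/x/tasks" "$demo/y/z"
touch "$demo/y/tasks" "$demo/y/z/other"

# Type test first: every object has its type checked.
type_first=$(find "$demo" -type d -name tasks -print | sort)

# Name test first: only objects named tasks go on to the type check.
name_first=$(find "$demo" -name tasks -type d -print | sort)

if [ "$type_first" = "$name_first" ]; then
    echo "same result either way: $name_first"
else
    echo "results differ"
fi

# To compare speed on a real system, time each form yourself, e.g.:
#   time find / -name tasks -type d -print > /dev/null
```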

Not only did I clear my e-mail inbox a little, but the old tech side of the brain is starting to warm up. In fact, I even remember something I meant to put in the vi series:

What always amazes me is that so many people will save the source and exit vi, just to test the program. If you are writing in something like shell scripting or Perl, it is much easier to just issue the following from within vi:

:w !%

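The :w performs a write, and the ! means invoke the following as a command. Strictly speaking, this form of :w does not save the file; it writes the contents of the edit buffer to the standard input of the command that follows the !. % is a shortcut for the current filename. Thus !% means execute the file we are currently editing, as it was last saved on disk, so issue a plain :w first if you want your latest changes to run. The output of the program appears on the screen, and you just hit return to continue editing. The drawback is that the program does not wait for keyboard input, because its standard input is already taken up by the buffer, so you must put any required input test data in a file, and redirect input on the ex line. For example, if we put some input data into a file named ‘input,’ the following would tell the currently edited program to take standard input from that file: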

:w !% <input

Note that there is a space between the w and the ! in all of these examples. If there is no space, vi (well, ex, actually) thinks that we are telling it to forcibly write to the file (:w!), instead of invoking a command.