Unix: When pipes don't make sense

Knowing when to pipe and when not to pipe remains a sign of Unix mastery

It's been nearly 20 years since I first came across the Useless Use of Cat (UUOC) awards. Unix notable and Perl disciple Randal Schwartz began handing out these embarrassing awards around 1995 to people who used commands such as cat myfile | head -3 when they could more easily have used head -3 myfile. Wasting system cycles and spawning unneeded processes is not a big sin when it comes to Unix, but it is wasteful and, well, indicative of users who are either falling into a bad habit or just not paying attention.

The worse offense, using pipes when they simply don't work, is generally a sign that someone is still trying to understand how the Unix command line works. The fact is that some Unix commands are engineered to read command output that is piped to them while others are not. Recognizing the difference takes time and a bit of patience, but pipe mastery is well worth the investment.

For the UUOC transgression, the basic thing to keep in mind is that many commands can work directly with files as easily as they can work with content that is piped to them. You can sort a file with sort myfile as easily as you can do the same thing with cat myfile | sort. And if minimizing your typing is a sign of Unix wisdom, the first command keeps your fingers from tapping any more keys than necessary, so bring on the smarts. In fact, there may be very few justifiable uses for commands that begin with cat myfile |. One example is when you want to do something like this -- though I can't say I've ever seen anyone actually doing this:

$ cat myfile | tee file1 file2 file3

In this command, we get to view the contents of a file and copy that content to several other files, all in one command. A more useful example of the "cat file |" command prefix might be this:

$ cat myfile | mailx -s "data file 12" recip11@mysite.org

Here we're sending the content of a file to someone's inbox. This kind of command proves useful in many situations, especially when collecting and emailing system performance or status information to yourself or a group of admins.

Most other uses of cat myfile | are probably not going to prove very clever, especially if you're sending the output to head, tail, wc, grep, awk, or other commands that are well prepared to read files without intervening pipes. An exercise that I like to give to my Intro Unix students is to have them rewrite commands such as cat /etc/passwd | grep $USER: without the pipes. "Sure," I tell them, "pipes are wonderful things. But that doesn't mean you need to use them every chance you get."
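As a rough sketch of that exercise, the commands below compare the piped and direct forms on a throwaway file. The sample file and the variable names are made up for illustration and are not part of the original exercise.

```shell
# Create a small sample file so the example is self-contained.
sample=$(mktemp)
printf 'alpha\nbravo\ncharlie\ndelta\n' > "$sample"

# Piped form: cat runs as an extra process just to hand the file to grep.
piped=$(cat "$sample" | grep -c 'a')

# Direct form: grep opens the file itself and gives the same count.
direct=$(grep -c 'a' "$sample")

# The same rewrite works for head (and for tail, wc, awk, and sort).
first_piped=$(cat "$sample" | head -n 2)
first_direct=$(head -n 2 "$sample")

# wc prints the file name when given a file operand; input redirection
# keeps the output to a bare number and still avoids cat.
count=$(wc -l < "$sample")

echo "$piped $direct $count"
rm -f "$sample"
```

By the same reasoning, the student exercise could be answered with something like grep "^$USER:" /etc/passwd, assuming $USER is set and the account is listed in the local passwd file.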

Of course, explaining to new Unix users when pipes work and when they don't is much harder than having them rewrite a list of suboptimal commands.

There are commands, such as cd, pwd, and date, that just aren't going to pay attention to anything that is piped to them, and commands (cd, anyway) that don't generate any output that can be piped to other commands. You can type ls | ls and get a file listing, but that first ls is not going to be delivering its output to the second. Similarly, who | ls is going to give you the same result because ls isn't looking for content being piped to it.
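One way to see this for yourself is to compare a command's output with and without something piped to it. The sketch below uses a scratch directory and invented file names; only ls, pwd, and echo are assumed to be available.

```shell
# Build a scratch directory with two known files.
dir=$(mktemp -d)
touch "$dir/a.txt" "$dir/b.txt"

# A plain listing.
plain=$(ls "$dir")

# The same listing with text piped in: ls never reads standard input,
# so the piped text is silently discarded.
piped=$(echo "this text is ignored" | ls "$dir")

# pwd behaves the same way: it reports the current directory no matter
# what arrives on the pipe.
here=$(pwd)
piped_pwd=$(echo /some/other/place | pwd)

echo "$plain"
echo "$piped"
```

Both listings come out the same, which is the point: the pipe is legal syntax, but nothing on the right-hand side is reading from it.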

Knowing when to use a pipe involves becoming familiar enough with each command to know what you can do with it. Commands such as cd, mkdir, rmdir, and touch don't generate output, nor do they read and use content that is piped to them. You can issue a command such as echo hello | touch, but you're not going to create a file named hello. Instead, you're just going to get an error along the lines of "touch: missing file operand". Other commands, such as head, tail, more, and less, can both generate output and read content that is piped to them. Then there are commands like ls and cal that do one of the two pipe functions (generate output) but not the other (read piped input).
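The three kinds of commands described above can be checked with a short experiment. This is only a sketch: the scratch directory and variable names are invented, and the exact wording of the touch error differs between Unix flavors, so the example looks at the exit status rather than the message.

```shell
work=$(mktemp -d)

# touch takes file names as arguments and never reads standard input,
# so this fails with an error and creates nothing.
touch_status=0
(cd "$work" && echo hello | touch 2>/dev/null) || touch_status=$?
if [ -e "$work/hello" ]; then piped_created=yes; else piped_created=no; fi

# Passing the name as an argument is what touch expects.
touch "$work/hello"
if [ -e "$work/hello" ]; then arg_created=yes; else arg_created=no; fi

# head is a filter: with no file operand it reads whatever is piped to it.
first=$(printf 'one\ntwo\nthree\n' | head -n 1)

# ls sits in between: it ignores piped input, but its output can be
# piped onward to a command that does read it.
listed=$(ls "$work" | wc -l)

echo "$piped_created $arg_created $first"
```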

When you think about Unix commands and pipes this way, it's not so surprising that neophytes find the learning curve of Unix a bit steep and might actually try commands such as who | ls or tail -1 dates | cd, never mind cat myfile | wc -l. Once they fully understand when pipes work, when they don't work and when they don't contribute (and are, thus, useless), they'll be ready to work wonders on the command line.

Read more of Sandra Henry-Stocker's Unix as a Second Language blog and follow the latest IT news at ITworld, Twitter, and Facebook.


Copyright © 2013 IDG Communications, Inc.
