Redirection

Chapter 06

In this lesson, we will unveil one of the most fascinating features of the command line: I/O redirection. The term "I/O" refers to input and output, and this functionality allows you to redirect the input and output of commands to and from files, as well as connect multiple commands together to form robust command pipelines. To demonstrate the power of this feature, we will introduce the following commands:

  • cat - Concatenate files

  • sort - Sort lines of text

  • uniq - Report or omit repeated lines

  • grep - Print lines matching a pattern

  • wc - Print newline, word, and byte counts for each file

  • head - Output the first part of a file

  • tail - Output the last part of a file

  • tee - Read from standard input and write to standard output and files

Standard Input, Output, And Error

Throughout our usage of various programs thus far, we have observed that they generate different types of output. This output generally comprises two categories. Firstly, we have the program's results, which refer to the intended data produced by the program. Secondly, we encounter status and error messages that provide information on the program's execution. Taking the ls command as an example, we can observe that it presents both its results and error messages directly on the screen.

Continuing with the Unix principle of "everything is a file," programs like ls utilize a unique file known as standard output (commonly referred to as stdout) to send their results, while their status messages are directed to another file called standard error (stderr). By default, both standard output and standard error are connected to the screen and aren't saved into a disk file.

Furthermore, numerous programs accept input from a component known as standard input (stdin), which is typically linked to the keyboard by default.

I/O redirection empowers us to modify the destinations of output and sources of input. Typically, output is directed to the screen, while input originates from the keyboard. However, through I/O redirection, we can alter these default behaviors.

Redirecting Standard Output

I/O redirection provides us with the ability to redefine the destination of standard output. To redirect standard output from the screen to another file, we employ the > redirection operator, followed by the desired file name. This functionality proves useful in scenarios where we wish to store the output of a command in a file. For instance, we can instruct the shell to send the output of the ls command to a file named ls-output.txt instead of displaying it on the screen.

In this case, we generated a detailed listing of the /usr/bin directory and directed the output to the file named ls-output.txt. Now, let's inspect the output that was redirected by the command:
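A command fitting this description might look like the following sketch (the file name ls-output.txt is the one used throughout this chapter):

```shell
ls -l /usr/bin > ls-output.txt   # results go to the file, not the screen
ls -l ls-output.txt              # inspect the file the shell created
```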

Great! We now have a substantial text file. If we examine the contents of the file using the less command, we can confirm that the ls-output.txt file does, in fact, contain the output of our ls command.

Now, let's repeat our redirection test with a twist. This time, we will modify the directory name to one that does not exist:
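Using the non-existent directory /bin/usr discussed below, the test might look like this sketch:

```shell
ls -l /bin/usr > ls-output.txt   # /bin/usr does not exist
# "ls: cannot access '/bin/usr': No such file or directory" still appears
# on the screen: it went to standard error, which we did not redirect.
ls -l ls-output.txt              # the file itself is empty
```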

We encountered an error message, which is expected since we specified a non-existent directory, /bin/usr. However, the error message was displayed on the screen instead of being redirected to the ls-output.txt file. This is because the ls program sends its error messages to standard error (stderr), not standard output (stdout). As we only redirected standard output and not standard error, the error message was still displayed on the screen. Now, let's examine the contents of our output file:

The file now has a size of zero! This is because when we redirect output with the > operator, the shell opens and truncates the destination file before the command even runs. Since our ls command sent no results to standard output (only an error message to standard error), nothing was written afterward, leaving the file empty. Interestingly, if we ever need to deliberately truncate a file or create a new empty file, we can employ a trick like this:
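The trick is simply a redirection with no command in front of it:

```shell
> ls-output.txt      # no command at all: truncate (or create) the file
ls -l ls-output.txt  # its size is now zero
```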

By using the redirection operator without any preceding command, we can truncate an existing file or create a new, empty file.

To append redirected output to a file instead of overwriting it from the beginning, we can use the >> redirection operator. Here's an example:
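A sketch of an append, reusing the ls-output.txt file from earlier:

```shell
ls -l /usr/bin >> ls-output.txt   # append; the file is created if absent
```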

By using the >> operator, the output will be appended to the file. If the file doesn't exist, it will be created, similar to how the > operator works. Let's test it out:
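Running the same append several times shows the file growing rather than being overwritten:

```shell
ls -l /usr/bin >> ls-output.txt
ls -l /usr/bin >> ls-output.txt
ls -l /usr/bin >> ls-output.txt
ls -l ls-output.txt   # three appends: three copies of the listing
```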

By executing the command multiple times, we appended the output to the file, making it three times larger.

Redirecting Standard Error

To redirect standard error, we need to use the file descriptor number. While we commonly refer to the first three file streams as standard input, output, and error, the shell internally uses file descriptors 0, 1, and 2 to represent them, respectively. To redirect standard error, which corresponds to file descriptor 2, we can use the file descriptor notation like this:
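A sketch of the notation, again using the non-existent /bin/usr to provoke an error:

```shell
ls -l /bin/usr 2> ls-error.txt   # file descriptor 2 is standard error
cat ls-error.txt                 # the error message was captured
```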

To redirect standard error to a file, we can use the file descriptor "2" right before the redirection operator. This will direct any error messages to the file ls-error.txt.

Redirecting Standard Output And Standard Error To One File

Sometimes, we might want to gather all the output from a command into one file. To achieve this, we need to redirect both standard output and standard error simultaneously. There are two methods for this: the classic approach, compatible with older shell versions, goes like this:
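The classic form might be sketched like this:

```shell
ls -l /bin/usr > ls-output.txt 2>&1   # stdout to the file, then stderr
                                      # to wherever stdout now points
cat ls-output.txt                     # the error message is in the file
```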

With this technique, we execute two redirections. Initially, we redirect the standard output to the file ls-output.txt, followed by redirecting file descriptor 2 (standard error) to file descriptor 1 (standard output) using the syntax 2>&1.

It's crucial to consider the sequence of redirections. Redirecting standard error must follow the redirection of standard output, or it won't function properly. In the given example,

> ls-output.txt 2>&1

directs standard error to the file ls-output.txt. However, altering the order to

2>&1 > ls-output.txt

will send standard error output to the screen.

Newer iterations of bash offer an alternative, more efficient approach for executing this combined redirection:
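The newer notation is bash-specific; a sketch:

```shell
ls -l /bin/usr &> ls-output.txt   # bash shorthand for > ls-output.txt 2>&1
```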

Here, we employ the single notation &> to redirect both standard output and standard error to the file ls-output.txt. You can also append the streams of standard output and standard error to a single file as follows:
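The appending form uses &>>, which requires bash 4 or later:

```shell
ls -l /bin/usr &>> ls-output.txt   # append both streams to the file
```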

Getting Rid of Unnecessary Output

Occasionally, we prefer silence over output when executing a command, especially for error and status messages. The system offers a method for achieving this by redirecting output to a unique file named /dev/null. This file serves as a system device called a bit bucket, essentially discarding any input it receives. To mute error messages from a command, follow these steps:
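A sketch of silencing the error output (note that only the message disappears; the command's exit status still records the failure):

```shell
ls -l /bin/usr 2> /dev/null   # the error message goes to the bit bucket
echo "exit status was $?"     # the failure is still reported via $?
```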

/dev/null In Unix Culture

The concept of the bit bucket originates from Unix's early days and has pervaded various aspects of Unix culture due to its widespread use. When someone mentions sending your comments to /dev/null, it refers to discarding them. For further illustrations, refer to the Wikipedia article on /dev/null.

Redirecting Standard Input

So far, we haven't come across any commands utilizing standard input (though, truthfully, we have - but we'll unveil that surprise shortly), prompting the need to introduce one now.

cat – Concatenate Files

The cat command reads one or more files and copies their contents to standard output.

In many ways, cat is comparable to the TYPE command in DOS. You can use it to display files without pagination; for example, cat ls-output.txt will show the contents of the file ls-output.txt. cat is commonly used for viewing short text files. Because cat can accept multiple files as arguments, it is also handy for joining files together. Imagine we have downloaded a large file that was split into several parts (common on Usenet for multimedia files), and the files were named:

movie.mpeg.001 movie.mpeg.002 ... movie.mpeg.099

we could join them back together with this command:

Because wildcards consistently expand in a sorted manner, the arguments will automatically arrange themselves in the correct order.
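With the hypothetical movie.mpeg parts above, the join would use a wildcard; the sketch below fabricates tiny stand-in parts in a scratch directory so it is safe to run anywhere:

```shell
cd "$(mktemp -d)"                 # scratch directory for the demo
echo "part one"   > movie.mpeg.001
echo "part two"   > movie.mpeg.002
echo "part three" > movie.mpeg.003
# Wildcards expand in sorted order, so the pieces line up correctly:
cat movie.mpeg.0* > movie.mpeg
cat movie.mpeg                    # the reassembled file
```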

That's all good, but how does this relate to standard input? Not much at this point. Let's explore further. What occurs if we input cat without any arguments:

Nothing seems to occur; it simply remains idle, appearing as if it's frozen. However, despite this appearance, it's actually functioning precisely as intended.

When cat lacks arguments, it reads from standard input, which, by default, is linked to the keyboard, so it is waiting for us to type something! Let's type a line of text, say "The quick brown fox jumped over the lazy dog.", and press Enter:

Afterward, input a Ctrl-d (meaning, hold the Ctrl key and press d) to signal to cat that it has reached the end of the file (EOF) on standard input:

When lacking filename arguments, cat duplicates standard input to standard output, resulting in the repetition of our entered text. This functionality allows us to craft brief text files. Suppose we aim to generate a file named lazy_dog.txt with the example text. To achieve this, we would execute the following command:
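A sketch of the session; a here-document stands in for typing the text and pressing Ctrl-d:

```shell
# Interactively you would type:  cat > lazy_dog.txt
# then the text, then Ctrl-d. The here-document below simulates that:
cat > lazy_dog.txt << EOF
The quick brown fox jumped over the lazy dog.
EOF
cat lazy_dog.txt   # copy the file back to stdout to check our work
```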

Enter the command and input the text you wish to place in the file. Don't forget to conclude with Ctrl-d. Using the command line, we've essentially created the simplest word processor! To review our work, we can utilize cat to duplicate the file content to stdout once more:

Having grasped how cat handles standard input, along with filename arguments, let's experiment with redirecting standard input:
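A sketch of input redirection (lazy_dog.txt is recreated first so the example stands on its own):

```shell
echo "The quick brown fox jumped over the lazy dog." > lazy_dog.txt
cat < lazy_dog.txt   # stdin comes from the file instead of the keyboard
```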

Through the < redirection operator, we alter the standard input's origin from the keyboard to the file lazy_dog.txt. The outcome mirrors that of using a single filename argument. While not especially advantageous compared to passing a filename argument, this showcases employing a file as the source of standard input. However, other commands leverage standard input more effectively, as we'll soon discover.

Before proceeding further, take a look at the cat command's manual page, as it contains several intriguing options worth exploring.

Pipelines

Pipelines are a shell feature that builds on the ability of commands to read data from standard input and send it to standard output. Using the pipe operator | (vertical bar), the standard output of one command can be connected directly to the standard input of another:

To illustrate this thoroughly, we'll require several commands. Recall our mention of a command that already accepts standard input? That's less. We can employ less to display, page-by-page, the output of any command that directs its results to standard output:
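For example, we can page through the long /usr/bin listing one screenful at a time (press q to quit the pager):

```shell
ls -l /usr/bin | less
```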

This method is incredibly useful! Through this approach, we can conveniently inspect the output of any command that generates standard output.

The Difference Between > and |

Initially, distinguishing between the redirection executed by the pipeline operator "|" and the redirection operator ">" might seem challenging. In essence, the redirection operator links a command to a file, whereas the pipeline operator connects the output of one command to the input of another command.

When learning about pipelines, many people try substituting > for | (that is, typing something like command1 > command2) just to see what happens.

Answer: sometimes something really bad.

Here's a real example shared by a reader who was managing a Linux-based server appliance. In their capacity as the superuser, they executed the following:
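As reconstructed from the description, the commands were along these lines; they are reproduced here only as comments, and the runnable part re-enacts the mistake harmlessly in a scratch directory with a stand-in file:

```shell
# The commands as described -- do NOT run these as root:
#   cd /usr/bin
#   ls > less
# A harmless re-enactment in a scratch directory:
cd "$(mktemp -d)"
echo "pretend this is the less program" > less   # stand-in for the binary
ls > less    # the shell truncates 'less' BEFORE ls even runs
cat less     # the stand-in is gone, replaced by a directory listing
```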

The first command changed to the directory housing most programs, while the second instructed the shell to overwrite the file less with the output of the ls command. Since the /usr/bin directory already contained a file named less (the less program itself), the second command overwrote the program file with the text from ls, thereby erasing the less program from their system.

The key lesson is that the redirection operator can quietly create or overwrite files, emphasizing the need for careful consideration and respect when using it.

Filters

Pipelines frequently serve for executing intricate operations on data, allowing multiple commands to be linked together. Often, these commands are termed as filters. Filters take input, modify it in some manner, and produce an output. Let's start with sort as the initial filter. Suppose we aim to create a consolidated list of all executable programs in /bin and /usr/bin, sort them, and display the resulting list:

Having specified two directories (/bin and /usr/bin), the output of ls would have comprised two sorted lists, one for each directory. Through the inclusion of sort in our pipeline, we transformed the data to generate a unified, sorted list.

uniq - Report Or Omit Repeated Lines

The uniq command is frequently paired with sort. uniq takes a sorted data list from either standard input or a single filename argument (refer to the uniq manual page for specifics) and, by default, eliminates any duplicate entries. Therefore, to ensure our list remains free of duplicates (i.e., any programs with identical names appearing in both the /bin and /usr/bin directories), we'll incorporate uniq into our pipeline:
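Adding uniq after sort might look like this (again, append | less to page through the result):

```shell
ls /bin /usr/bin | sort | uniq   # duplicates removed from the sorted list
```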

In this instance, uniq is employed to eliminate duplicates from the result produced by the sort command. Should we desire to display the list of duplicates instead, we include the -d option with uniq, as demonstrated below:
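A sketch of the -d variant:

```shell
ls /bin /usr/bin | sort | uniq -d   # show ONLY the duplicated lines
```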

wc – Print Line, Word, And Byte Counts

The wc (word count) command is utilized to showcase the count of lines, words, and bytes present in files. For instance:
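A sketch, recreating the ls-output.txt file used in earlier examples:

```shell
ls -l /usr/bin > ls-output.txt   # recreate the listing file
wc ls-output.txt                 # prints: lines  words  bytes  filename
```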

Here, it displays three figures: lines, words, and bytes within ls-output.txt. Similar to our prior commands, when executed without command line arguments, wc considers standard input. The -l option restricts its output solely to line counts. Including it within a pipeline proves useful for counting. To ascertain the count of items within our sorted list, we can execute the following:
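Counting the items in our sorted, de-duplicated list might look like this:

```shell
ls /bin /usr/bin | sort | uniq | wc -l   # how many distinct names?
```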

grep – Print Lines Matching A Pattern

grep is a robust tool employed to locate text patterns within files. Its usage is exemplified below:
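The general form is grep pattern [file...]; in a pipeline, grep reads the preceding command's output instead of a file. The pattern "sh" is an arbitrary choice here, picked so the search matches on most systems:

```shell
ls /bin /usr/bin | sort | uniq | grep sh   # lines containing "sh"
```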

Upon encountering a "pattern" in the file, grep displays the lines containing that specific pattern. Although grep can handle intricate patterns, let's focus on straightforward text matches for now. We'll delve into more advanced patterns, known as regular expressions, in a subsequent chapter.

There are a couple of useful options available for grep: -i disregards case sensitivity during the search (typically, searches are case sensitive), and -v instructs grep to exclusively display lines that do not match the specified pattern.

head / tail – Print First / Last Part Of Files

Occasionally, you might not need all the output from a command, only the first few or last few lines. The head command prints the first part of a file and tail prints the last part; by default, both print ten lines of text, which can be changed with the -n option:
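A sketch using the ls-output.txt file (recreated first so the example stands alone):

```shell
ls -l /usr/bin > ls-output.txt
head -n 5 ls-output.txt   # only the first five lines
tail -n 5 ls-output.txt   # only the last five lines
```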

These can be used in pipelines as well:
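For example:

```shell
ls /usr/bin | tail -n 5   # only the last five names ls produces
```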

The tail command offers an option enabling real-time viewing of files, particularly handy for monitoring the progression of log files during writing. In the upcoming example, we'll observe the messages file within /var/log (or the /var/log/syslog file if messages isn't present). Note that accessing this may necessitate superuser privileges on certain Linux distributions, as the /var/log/messages file might contain security-related data:
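The real-world command is tail -f /var/log/messages (with sudo where needed). Since that requires a live system log, the sketch below follows a scratch file instead; timeout (a coreutils command, added here for the demo) ends the watch after two seconds where you would normally press Ctrl-c:

```shell
log=$(mktemp)                     # scratch file standing in for the log
( for i in 1 2 3; do echo "log line $i" >> "$log"; sleep 0.3; done ) &
timeout 2 tail -f "$log"          # new lines appear as they are written
wait
rm -f "$log"
```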

Employing the -f option, tail continuously observes the file, instantly displaying newly appended lines on the screen. This process persists until you input Ctrl-c.

tee – Read From Stdin And Output To Stdout And Files

In line with our plumbing analogy, Linux offers a command named tee, acting as a tee fitting on our pipeline. The tee program reads standard input, duplicating it to both standard output (enabling the data to proceed through the pipeline) and to one or multiple files. This proves beneficial for capturing the content within a pipeline at an intermediate processing stage. Below, we revisit a previous example, this time incorporating tee to save the complete directory listing into the file ls.txt before grep filters the pipeline's content:
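A sketch of that pipeline; the grep pattern "sh" is an arbitrary stand-in chosen so the filter matches on most systems:

```shell
# tee saves a copy of the full listing in ls.txt while the data
# continues down the pipeline to grep:
ls /usr/bin | tee ls.txt | grep sh
```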

Summary

As usual, explore the documentation for each command covered in this chapter. We've only touched upon their fundamental usage, but they boast numerous intriguing options. With more Linux experience, you'll discover the command line's redirection feature to be incredibly beneficial for tackling specific issues. Many commands utilize standard input and output, while nearly all command line programs rely on standard error to showcase their informative messages.
