Seeing the World as the Shell Sees it

Chapter 07


In this chapter, we will look at some of the wizardry that happens on the command line when you press the Enter key. Although we will examine several interesting and intricate features of the shell, we will do it with just one new command:

  • echo - Display a line of text

Expansion

Each time you type a command line and press Enter, bash performs several substitutions on the text before it carries out your command. We've already seen cases where a simple character sequence, such as *, has a lot of meaning to the shell. The process that makes this happen is called expansion. With expansion, what you type is expanded into something else before the shell acts on it. To demonstrate, let's look at the echo command. echo is a shell builtin that performs a very simple task: it prints its text arguments on standard output:
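A minimal illustration of that behavior might look like this (the words passed to echo are arbitrary):

```shell
# echo prints its arguments, separated by single spaces, followed by a newline.
echo this is a test
# prints: this is a test
```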

That's quite simple. Any input given to echo will be shown. Let's explore another instance:
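As a sketch, the next instance can be reproduced in a scratch directory so that the result is predictable. The directory and the three filenames are made up for this illustration:

```shell
# Work in a scratch directory so the result does not depend on your own files.
scratch=$(mktemp -d)
cd "$scratch"
touch notes.txt report.txt todo.txt

# The shell replaces * with the matching filenames before echo runs.
echo *
# prints: notes.txt report.txt todo.txt
```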

What just occurred there? Why didn't echo display *? As we've previously learned about wildcards, the * symbol signifies matching any characters in a filename. However, what we didn't explore earlier was how the shell accomplishes this. Simply put, the shell transforms the * into something else (in this case, the names of the files in the current working directory) before executing the echo command. Upon hitting the enter key, the shell automatically expands any applicable characters in the command line before executing the command. Therefore, the echo command never encountered the *; it only received its expanded outcome. Understanding this clarifies why echo behaved as anticipated.

Pathname Expansion

Wildcards work through a mechanism called pathname expansion. If we try some of the techniques from earlier chapters, we will see that they are really expansions. Given a home directory that looks like this:

we could carry out the following expansions:

and:

or even:

and looking beyond our home directory:
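The expansions described above can be tried with a sketch like the following. It recreates a typical home-directory layout in a scratch directory; the directory and file names are assumptions for illustration, and the last line's output depends on the system it runs on:

```shell
# Build a stand-in "home directory" (names are illustrative only).
demo_home=$(mktemp -d)
cd "$demo_home"
mkdir Desktop Documents Music Pictures Public Templates Videos
touch ls-output.txt

echo D*              # names beginning with "D": Desktop Documents
echo *s              # names ending with "s": Documents Pictures Templates Videos
echo [[:upper:]]*    # names beginning with an uppercase letter (the seven directories)
echo /usr/*/share    # a pattern that looks beyond the current directory
```

If a pattern matches nothing, bash by default leaves it unexpanded, so echo prints the pattern itself.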

Pathname Expansion Of Hidden Files

As we're aware, files starting with a period are hidden, and this behavior is respected by pathname expansion. When we perform an expansion like:

echo *

it doesn't display hidden files.

Initially, it might seem possible to incorporate hidden files in an expansion by commencing the pattern with a leading period, as shown here:

echo .*

It almost works. However, if we examine the results closely, we will see that the names “.” and “..” may also appear (bash 5.2 and later omit them by default). Because these names refer to the current working directory and its parent directory, using this pattern will likely produce an incorrect result. We can see this by trying the command:

ls -d .* | less

For more accurate pathname expansion in this scenario, we need to utilize a more precise pattern:

echo .[!.]*

This pattern expands to every filename that begins with a single period, followed by at least one character that is not a period, followed by any other characters. It works correctly for most hidden files, although it still misses filenames with multiple leading periods. The ls command with the -A option ("almost all") provides a correct listing of hidden files:

ls -A
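To compare the three approaches side by side, here is a small sketch run in a scratch directory. The filenames, including the one with two leading periods, are invented for the demonstration:

```shell
# Scratch directory with one visible file and two hidden ones.
scratch=$(mktemp -d)
cd "$scratch"
touch visible.txt .hidden ..double-dot

echo *          # prints: visible.txt
echo .[!.]*     # prints: .hidden   (misses ..double-dot)
ls -A           # lists all three files, but not "." or ".."
```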

Tilde Expansion

As we discussed during our introduction to the cd command, the tilde character (~) holds a unique significance. When placed at the start of a word, it expands to represent the home directory of the specified user or, if no user is named, the home directory of the current user:

If user foo has an account, then:
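Both forms can be tried with a sketch like this one. The account foo in the text is hypothetical, so this example substitutes root, on the assumption that a root account exists on the system; the paths printed will differ from machine to machine:

```shell
echo ~        # the current user's home directory (normally the value of $HOME)
echo ~root    # the home directory of the user named root
```

If the named user does not exist, bash leaves the word unchanged.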

Arithmetic Expansion

Expansion in the shell permits arithmetic operations, enabling us to utilize the shell prompt as a calculator:
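For instance, a simple sum (the numbers are arbitrary):

```shell
echo $((2 + 2))
# prints: 4
```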

Arithmetic expansion uses the form:

$((expression))

where the expression comprises values and arithmetic operators to form an arithmetic expression.

Arithmetic expansion exclusively handles integers (whole numbers without decimals) but can execute various operations. Below are some of the supported operators:

Operator  Description
+         Addition
-         Subtraction
*         Multiplication
/         Division (since arithmetic expansion supports only integers, the results are also integers)
%         Modulo, which simply means “remainder”
**        Exponentiation

Spaces are not significant in arithmetic expressions, and expressions may be nested. For example, to multiply 5 squared by 3:
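One way to write that calculation with a nested expansion:

```shell
# The inner expansion yields 25; the outer one multiplies it by 3.
echo $(($((5**2)) * 3))
# prints: 75
```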

Using single parentheses allows for the grouping of multiple subexpressions. Employing this method, we can revise the previous example and achieve the same result with a single expansion instead of two:
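A sketch of the single-expansion version:

```shell
# Plain parentheses group the subexpression inside one $(( )).
echo $(((5**2) * 3))
# prints: 75
```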

Here's an illustration employing the division and remainder operators. Observe the outcome of integer division:
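The operands below are chosen only to make the truncation visible:

```shell
echo Five divided by two equals $((5/2))
# prints: Five divided by two equals 2
echo with $((5%2)) left over.
# prints: with 1 left over.
```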

Arithmetic expansion is covered in greater detail in Chapter 34.

Brace Expansion

One of the more peculiar expansions is known as brace expansion. Through this, you can generate multiple text strings from a pattern that includes braces. Take a look at this example:
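A small illustration, with arbitrary text before and after the braces:

```shell
echo Front-{A,B,C}-Back
# prints: Front-A-Back Front-B-Back Front-C-Back
```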

Patterns to be brace expanded may contain a leading portion called a preamble and a trailing portion called a postscript. The brace expression itself may contain either a comma-separated list of strings or a range of integers or single characters. The pattern may not contain unquoted whitespace. Here is an example using a range of integers:
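In this sketch, Number_ is an arbitrary preamble:

```shell
echo Number_{1..5}
# prints: Number_1 Number_2 Number_3 Number_4 Number_5
```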

Integers may also be zero-padded like so:
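The following sketch assumes bash 4 or later; older versions accept the syntax but do not pad the results:

```shell
echo {01..15}
# prints: 01 02 03 04 05 06 07 08 09 10 11 12 13 14 15
echo {001..15}
# prints the same range padded to three digits: 001 002 003 and so on up to 015
```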

A range of letters in reverse order:
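For example:

```shell
echo {Z..A}
# prints: Z Y X W V U T S R Q P O N M L K J I H G F E D C B A
```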

Brace expansions may be nested:
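A short sketch of nesting, using arbitrary letters and digits:

```shell
# The inner braces expand within each alternative of the outer braces.
echo a{A{1,2},B{3,4}}b
# prints: aA1b aA2b aB3b aB4b
```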

What purpose does this serve? The primary use is creating lists of files or directories to be generated. Consider a scenario where, as photographers, we possess an extensive collection of images that require organization by years and months. Our initial step might involve creating a sequence of directories labeled in a numeric “Year-Month” structure to ensure chronological sorting. While manually listing all directories is laborious and error-prone, there's a more efficient way to accomplish this:
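One possible version is sketched below. The Photos directory name and the range of years are assumptions for illustration, the work happens in a scratch directory, and the zero-padded months assume bash 4 or later:

```shell
# Create a hypothetical Photos directory inside a scratch location.
scratch=$(mktemp -d)
mkdir "$scratch/Photos"
cd "$scratch/Photos"

# Two brace expressions multiply out: 3 years x 12 months = 36 directories.
mkdir {2007..2009}-{01..12}
ls
```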

Pretty cool, right?

Parameter Expansion

In this chapter, we'll only touch briefly on parameter expansion; we'll cover it in more depth later. It is more useful in shell scripts than directly on the command line. Many of its capabilities have to do with the system's ability to store small chunks of data and to give each chunk a name. Many such chunks, more properly called variables, are available for your examination. For example, the variable named USER contains your username. To invoke parameter expansion and reveal the contents of USER, you would do this:
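A minimal example; the output is whatever username the current session has:

```shell
# $USER is replaced by the value of the USER variable before echo runs.
echo $USER
```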

To see a list of available variables, try this:
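One way to do this is with printenv. The sketch below trims the output to five lines so that it can run unattended; at an interactive prompt you would more likely page through the full list with printenv | less:

```shell
# printenv lists environment variables as NAME=value pairs.
printenv | sort | head -n 5
```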

While working with other expansion types, mistyping a pattern prevents the expansion, leading the echo command to display the mistyped pattern. However, in parameter expansion, if you misspell a variable name, the expansion proceeds but results in an empty string instead:
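In this sketch, SUER is a deliberate misspelling of USER:

```shell
unset SUER     # make sure the misspelled name really is undefined
echo $SUER
# prints an empty line
```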

Command Substitution

Command substitution enables us to utilize the output of a command for expansion:
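A small sketch, run in a scratch directory with made-up filenames so that the output is predictable:

```shell
scratch=$(mktemp -d)
cd "$scratch"
touch alpha beta gamma

# The output of ls becomes the argument list of echo.
echo $(ls)
# prints: alpha beta gamma
```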

One of my favorites goes something like this:
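A sketch of that idea, assuming the which command is installed (the path and file details printed vary by system):

```shell
# which cp prints the full pathname of cp; ls -l then lists that file.
ls -l $(which cp)
```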

In this instance, we utilized the output of which cp as an argument for the ls command, obtaining the listing of the cp program without requiring its complete pathname. This method isn't restricted to basic commands; entire pipelines can also be employed (only a partial output is displayed):
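A pipeline of this kind might look like the following. It assumes the file command is installed and that /usr/bin contains programs with "zip" in their names; the output depends entirely on the system:

```shell
# List everything in /usr/bin, keep the names containing "zip",
# and pass those names to file as its arguments.
file $(ls -d /usr/bin/* | grep zip)
```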

In this instance, the argument list for the file command was populated by the results of the pipeline.

Bash also supports an alternative syntax for command substitution, inherited from older shell programs, which uses back-quotes (`) instead of the dollar sign and parentheses:
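For example, a listing of the cp program written with back-quotes (again assuming which is installed):

```shell
# Equivalent to: ls -l $(which cp)
ls -l `which cp`
```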

Quoting

Now that we've explored the numerous ways the shell executes expansions, it's essential to understand how we can manage this process. Consider, for instance:
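A sketch of the first case, with several spaces typed between two of the words:

```shell
echo this is a    test
# prints: this is a test
```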

or:
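A sketch of the second case. The set -- line is an addition that clears the positional parameters, so that $1 is undefined just as it would be at a fresh prompt:

```shell
set --    # clear positional parameters so that $1 is unset
echo The total is $100.00
# prints: The total is 00.00
```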

In the first example, word-splitting by the shell removed the extra whitespace from the echo command's list of arguments. In the second example, parameter expansion substituted an empty string for $1 because it was an undefined variable. The shell provides a mechanism called quoting to selectively suppress unwanted expansions.

Double Quotes

The initial form of quoting we'll explore involves double quotes. When you enclose text within double quotes, the special characters used by the shell lose their unique significance and are treated as regular characters. Exceptions to this rule are $, \ (backslash), and ` (back-quote). This implies that word-splitting, pathname expansion, tilde expansion, and brace expansion are suppressed, while parameter expansion, arithmetic expansion, and command substitution still occur. With double quotes, managing filenames containing spaces becomes feasible. For instance, imagine having a file named two words.txt. If used on the command line without quotes, word-splitting would treat it as two separate arguments instead of a single desired argument:
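The problem can be reproduced with a sketch like this, run in a scratch directory. The error text printed by ls varies between systems, so the final message here is supplied by the sketch itself:

```shell
scratch=$(mktemp -d)
cd "$scratch"
touch "two words.txt"

# Unquoted, ls receives two arguments, "two" and "words.txt", and finds neither.
ls -l two words.txt || echo "ls could not find 'two' or 'words.txt'"
```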

Using double quotes stops the word-splitting and gives the desired result. Better still, we can repair the damage by renaming the file:
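A sketch of both steps in a scratch directory; two_words.txt is simply one reasonable choice for the new name:

```shell
scratch=$(mktemp -d)
cd "$scratch"
touch "two words.txt"

ls -l "two words.txt"               # quoted: a single argument, so ls finds the file
mv "two words.txt" two_words.txt    # rename it so that quoting is no longer needed
ls -l two_words.txt
```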

There you go! No need to repeatedly type those bothersome double quotes anymore.

Keep in mind that within double quotes, parameter expansion, arithmetic expansion, and command substitution remain active:
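The following sketch combines all three inside one pair of double quotes. The date command is used here only as a convenient command to substitute:

```shell
# Parameter expansion, arithmetic expansion, and command substitution
# all still take place inside double quotes.
echo "$USER $((2+2)) $(date +%Y)"
```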

Let's examine the impact of double quotes on command substitution for a moment. Before that, it's beneficial to explore how word splitting operates more intricately. In our previous example, we observed how word splitting seemingly eliminates additional spaces in our text:

By default, word splitting identifies spaces, tabs, and newlines as delimiters between words. Unquoted spaces, tabs, and newlines are treated as separators rather than part of the text. Consequently, they divide words into separate arguments. In our command line example, there's a command followed by four distinct arguments due to this separation. However, if we introduce double quotes:

word-splitting is suppressed and the embedded spaces are no longer considered delimiters; instead, they become part of the argument. Upon adding double quotes, our command line now consists of a command followed by a single argument.
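The difference in argument counts can be made visible with a small helper. count_args is a function invented for this sketch; it is not a standard command:

```shell
# Hypothetical helper: report how many arguments the shell passed in.
count_args() { echo "$# argument(s)"; }

count_args this is a    test      # prints: 4 argument(s)
count_args "this is a    test"    # prints: 1 argument(s)

echo this is a    test            # prints: this is a test
echo "this is a    test"          # prints: this is a    test
```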

The treatment of newlines as delimiters in the word-splitting mechanism leads to an intriguing yet subtle impact on command substitution. Take this into consideration:
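The effect can be sketched with any command whose output spans several lines. Here three_lines and count_args are stand-ins made up for the demonstration, so the argument counts are smaller than the figure quoted in the text:

```shell
# Hypothetical helpers for this sketch.
count_args()  { echo "$# argument(s)"; }
three_lines() { printf 'one two\nthree four\nfive six\n'; }

echo $(three_lines)            # unquoted: prints one line, one two three four five six
echo "$(three_lines)"          # quoted: prints three lines, newlines preserved

count_args $(three_lines)      # prints: 6 argument(s)
count_args "$(three_lines)"    # prints: 1 argument(s)
```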

In the first instance, the unquoted command substitution resulted in a command line containing 38 arguments. In the second, it resulted in a command line with a single argument that includes the embedded spaces and newlines.

Single Quotes

When we need to suppress all expansions, we use single quotes. Here is a comparison of unquoted text, double-quoted text, and single-quoted text:
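One way to set up such a comparison is to pass the same text to echo three times. The text itself is arbitrary, and the output of the first line depends on which .txt files exist in your home directory:

```shell
# Unquoted: every expansion is performed.
echo text ~/*.txt {a,b} $(echo foo) $((2+2)) $USER

# Double quotes: only $-based expansions are performed
# (command substitution, arithmetic expansion, parameter expansion).
echo "text ~/*.txt {a,b} $(echo foo) $((2+2)) $USER"

# Single quotes: nothing is expanded; the text is printed literally.
echo 'text ~/*.txt {a,b} $(echo foo) $((2+2)) $USER'
```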

Notice how as we progress through each level of quoting, an increasing number of expansions get suppressed.

Escaping Characters

Sometimes we want to quote only a single character. To do this, we can precede the character with a backslash, which in this context is called the escape character. Often this is done inside double quotes to selectively prevent an expansion:
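In this sketch (the message is invented), $USER is expanded while the escaped dollar sign is printed literally:

```shell
# Without the backslash, $5 would be treated as a positional parameter.
echo "The balance for user $USER is: \$5.00"
```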

Escaping is also commonly used to eliminate the special meaning of a character in a filename. For example, filenames may contain characters that normally have special meaning to the shell, such as $, !, &, and spaces. To use such a character in a filename, precede it with a backslash:
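A sketch with made-up filenames, run in a scratch directory:

```shell
scratch=$(mktemp -d)
cd "$scratch"
touch 'bad&filename'

# The backslash makes & an ordinary character instead of a control operator.
mv bad\&filename good_filename
ls
# prints: good_filename
```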

To include a literal backslash character, escape it by typing \\. Note that within single quotes, the backslash loses its special meaning and is treated as an ordinary character.

Backslash Escape Sequences

Apart from its role as the escape character, the backslash is also part of a notation for representing certain special characters called control codes. The first 32 characters in the ASCII coding scheme are used to transmit commands to teletype-like devices. Some of these codes are familiar (tab, backspace, linefeed, and carriage return), while others are not (null, end-of-transmission, and acknowledge).

Escape Sequence  Meaning
\a               Bell (“Alert” - causes the computer to beep)
\b               Backspace
\n               Newline. On Unix-like systems, this produces a linefeed.
\r               Carriage return
\t               Tab

The table above lists some of the common backslash escape sequences. The idea behind this backslash notation originated in the C programming language and has been adopted by many others, including the shell. Adding the -e option to echo enables interpretation of escape sequences. You may also place them inside $' '. Here, using the sleep command, a simple program that waits for the specified number of seconds and then exits, we can create a primitive countdown timer:
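A minimal sketch of such a timer; the two-second delay and the message are arbitrary choices:

```shell
# Wait, then print a message followed by the bell character.
sleep 2; echo -e "Time's up\a"
```

Whether the bell is audible depends on the terminal's settings.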

We could also do this:
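The same sketch using the $' ' form, which bash interprets itself, so echo needs no -e option:

```shell
# $'\a' expands to the bell character before echo runs.
sleep 2; echo "Time's up" $'\a'
```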

Summary

As we progress in working with the shell, we'll notice expansions and quoting becoming more prevalent. Hence, it's valuable to gain a strong comprehension of their functionality. In fact, some might argue that these are among the most crucial topics to grasp regarding the shell. Without a comprehensive understanding of expansion, the shell might always seem enigmatic and puzzling, leading to untapped potential and wasted power.
