Storage Media

Chapter 15

In preceding sections, we've explored the manipulation of data on a file level. This chapter delves into handling data at the device level within Linux. Linux offers robust capabilities in managing various storage devices, including physical ones like hard disks, network storage, and virtual storage devices such as RAID and LVM.

However, given the scope of this book, which isn't centered on system administration, a comprehensive exploration isn't feasible. Instead, the focus is on introducing key concepts and essential commands used in storage device management. To facilitate the exercises in this chapter, we'll utilize a USB flash drive, a CD-RW disc (where systems have a CD-ROM burner), and a floppy disk (if the system supports it).

The commands covered in this chapter include:

  • mount: for mounting a file system

  • umount: to unmount a file system

  • fsck: used to check and repair a file system

  • fdisk: for manipulating partition tables

  • mkfs: for creating a file system

  • fdformat: used in formatting a floppy disk

  • dd: for directly writing block-oriented data to a device

  • genisoimage (mkisofs): used in creating an ISO 9660 image file

  • wodim (cdrecord): for writing data to optical storage media

  • md5sum: used to calculate an MD5 checksum

Mounting And Unmounting Storage Devices

Recent strides in the Linux desktop have simplified storage device management significantly for desktop users. Nowadays, when we connect a device to our system, it typically functions seamlessly without requiring much intervention. Contrast this with the earlier days, around 2004, when these tasks often demanded manual handling. However, on non-desktop systems, particularly servers, the process remains predominantly manual due to the intricate storage demands and complex configurations they often entail.

The first step in managing a storage device is attaching it to the file system tree, a process known as mounting. Once mounted, the device's contents are accessible through the directory where it is attached. As we learned in Chapter 2, Unix-like systems, including Linux, maintain a single file system tree with devices attached at various points. This differs from operating systems like MS-DOS and Windows, which maintain a separate file system tree for each device (for example C:\, D:\, etc.).

The file /etc/fstab contains a list of devices (usually hard disk partitions) designated for mounting during system boot. Here's an example of an /etc/fstab file from a Fedora 7 system:

Among the file systems detailed in this sample file, the majority are virtual and fall beyond the scope of our current discussion. Of particular relevance to our purposes are the initial three:

These are the hard disk partitions. Each line of the file consists of six fields, as follows:

  • Field 1, Device: Traditionally, this field holds the name of the device file associated with the physical device, such as /dev/hda1 (the first partition of the master device on the first IDE channel). With today's computers and their many hot-pluggable devices, such as USB drives, many modern Linux distributions instead associate a device with a text label. The label is added to the storage medium when it is formatted and is read by the operating system when the device is attached, so the correct device is identified no matter which device file it is assigned.

  • Field 2, Mount Point: The directory where the device is attached to the file system tree.

  • Field 3, File System Type: Linux can mount many file system types. Most native Linux file systems are ext3, but many others are supported, such as FAT16 (msdos), FAT32 (vfat), NTFS (ntfs), CD-ROM (iso9660), and more.

  • Field 4, Options: File systems can be mounted with various options. For example, a file system can be mounted read-only, or mounted so that no programs can be executed from it (a useful security feature for removable media).

  • Field 5, Frequency: A single number that specifies if and when a file system is to be backed up with the dump command.

  • Field 6, Order: A single number that specifies in what order file systems should be checked with the fsck command.
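As a rough illustration of the field layout, the sketch below splits one fstab-style line into the six fields using the shell's read builtin. The sample entry is hypothetical and is not taken from the Fedora listing.

```shell
# Hypothetical fstab-style entry, for illustration only.
entry='LABEL=/home /home ext3 defaults 1 2'

# read splits the line on whitespace and assigns one field per variable.
read -r device mount_point fs_type options frequency order <<EOF
$entry
EOF

printf 'device:      %s\n' "$device"
printf 'mount point: %s\n' "$mount_point"
printf 'type:        %s\n' "$fs_type"
printf 'options:     %s\n' "$options"
printf 'dump freq:   %s\n' "$frequency"
printf 'fsck order:  %s\n' "$order"
```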

Viewing A List Of Mounted File Systems

The mount command is used to mount file systems. Entering the command without arguments displays a list of the file systems currently mounted.

The format of the listing is: device on mount_point type file_system_type (options). For example, the first line shows that device /dev/sda2 is mounted as the root file system, is of type ext3, and is both readable and writable (the option rw). The listing also has two interesting entries at the bottom. The next-to-last entry shows a 2-gigabyte SD memory card in a card reader mounted at /media/disk, and the last entry is a network drive mounted at /misc/musicbox.
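Because every line follows the same layout, the listing is easy to filter. The sketch below runs awk over two sample lines; the first matches the root entry described above, while the device name and options on the second line are assumptions made for illustration.

```shell
# Sample lines in the format printed by mount. The device name and options
# on the second line are hypothetical.
sample='/dev/sda2 on / type ext3 (rw)
/dev/sdd1 on /media/disk type vfat (rw,nosuid,nodev)'

# Fields: $1 = device, $3 = mount point, $5 = file system type, $6 = options.
printf '%s\n' "$sample" | awk '$3 == "/media/disk" { print $1, $5, $6 }'
```

On a live system, the same awk program can be fed from the mount command itself in place of the sample text.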

To commence our initial trial, let's observe a system state without a CD-ROM inserted:

The displayed information pertains to a CentOS 5 system employing LVM (Logical Volume Manager) for its root file system. Similar to numerous contemporary Linux distributions, this system endeavors to automatically mount the CD-ROM upon insertion. Subsequent to inserting the disc, the following details become visible:

After the disc is inserted, the listing is the same as before with one additional entry at the end. The new entry shows that the CD-ROM (device /dev/hdc on this system) has been mounted on /media/live-1.0.10-8 and is of type iso9660 (a CD-ROM). For the purposes of our experiment, we are interested in the name of the device. When you conduct this experiment yourself, the device name will most likely be different.

Warning

In the examples that follow, it is vitally important that you pay close attention to the actual device names in use on your system and do not use the names used in this text!

Also note that audio CDs are not the same as CD-ROMs. Audio CDs do not contain file systems and thus cannot be mounted in the usual sense.

Now that we have identified the CD-ROM drive's device name, let's proceed by unmounting the disc and subsequently mounting it at a different location within the file system tree. To execute this, we elevate privileges to superuser (using the appropriate command for our system) and unmount the disc utilizing the umount command (observe the spelling):

Next, let's proceed by establishing a fresh mount point for the disk. A mount point essentially constitutes a directory situated within the file system tree. It's a standard directory without any special attributes. Interestingly, it doesn't necessarily have to be an empty directory. However, if you mount a device onto a non-empty directory, you won't be able to view the directory's previous contents until you unmount the device. For our current objectives, we'll create a new directory:

Finally, we mount the CD-ROM at the new mount point. The -t option is used to specify the file system type:

Afterward, we can examine the contents of the CD-ROM via the new mount point:

Notice what happens when we try to unmount the CD-ROM:

Why is this? A device cannot be unmounted while it is in use by a process or user. In this case, we changed our working directory to the mount point for the CD-ROM, which makes the device busy. We can easily remedy the issue by changing the working directory to something other than the mount point:

Now the device unmounts successfully.
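The whole sequence can be gathered into a small shell function. This is only a sketch: the function name remount_at is our own invention, it must be run as the superuser, and the device and directory are whatever the caller supplies.

```shell
# remount_at DEVICE DIRECTORY
# Unmount DEVICE, create DIRECTORY if needed, and mount DEVICE there as iso9660.
# Must be run as the superuser; both arguments are supplied by the caller.
remount_at() {
    device=$1
    mount_point=$2

    # Step out of any directory on the device first; a busy device
    # cannot be unmounted. Note that this changes the caller's directory.
    cd / || return 1

    umount "$device" || return 1
    mkdir -p "$mount_point" || return 1
    mount -t iso9660 "$device" "$mount_point"
}
```

Defining the function runs nothing. A call such as remount_at /dev/hdc /mnt/cdrom would perform the steps, with the device name replaced by the one on your own system and /mnt/cdrom standing in for any directory you choose.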

Why Unmounting Is Important

Looking at the output of the free command, which presents memory usage stats, you'll notice a metric called "buffers." Systems are designed for optimal speed. Slow devices, like printers, present a challenge as they operate much slower compared to computers. In the past, especially in the early days of PCs before multitasking, printing posed a genuine slowdown. Tasks would pause whenever printing commenced, causing inconvenience. To mitigate this, the printer buffer emerged—a device with its own RAM memory positioned between the computer and printer.

Buffering is used extensively in computers to make them faster. It keeps the occasional need to read or write slow devices from holding up the rest of the system. Operating systems store data that has been read from, or is about to be written to, storage devices in memory for as long as possible before actually interacting with the slower device. On a Linux system, for example, memory appears to fill up the longer the system runs. This does not mean that Linux is running out of memory; it is taking advantage of otherwise unused memory to do as much buffering as it can.

This buffering makes writing to storage devices very fast, because the physical write is deferred while the data accumulates in memory. From time to time, the operating system writes this data to the physical device. Unmounting a device causes all remaining data to be written to it so that it can be safely removed. If the device is removed without being unmounted first, some data may never reach it. That data can include vital directory updates, and losing them corrupts the file system, one of the worst things that can happen on a computer.

Determining Device Names

Determining the name of a device is sometimes difficult. In the past it was not very hard, because a device was always in the same place and did not change, which is how Unix-like systems like it. When Unix was developed, "changing a disk drive" meant using a forklift to remove a washing machine-sized device from the computer room. In recent years the typical desktop hardware configuration has become quite dynamic, and Linux has evolved to be more flexible than its ancestors.

In the previously illustrated examples, we capitalized on the modern Linux desktop's capacity to automatically mount devices and subsequently identify their names. But what if we're dealing with a server or an environment where this automatic process doesn't occur? How do we navigate this scenario?

Let's begin by examining how the system assigns names to devices. By listing the contents of the /dev directory—the residence of all devices—we encounter numerous devices:

The contents of this listing reveal some patterns of device naming. Here are a few:

  • /dev/fd*: Floppy disk drives.

  • /dev/hd*: IDE (PATA) disks on older systems. A typical motherboard has two IDE connectors or channels, each with a cable that has two attachment points for drives. The first drive on the cable is called the master device, and the second is called the slave device. The device names are ordered so that /dev/hda refers to the master device on the first channel, /dev/hdb is the slave device on the first channel, /dev/hdc is the master device on the second channel, and so on. A trailing digit indicates the partition number on the device. For example, /dev/hda1 refers to the first partition on the first hard drive in the system, while /dev/hda refers to the entire drive.

  • /dev/lp*: Printers.

  • /dev/sd*: SCSI disks. On modern Linux systems, the kernel treats all disk-like devices (including PATA/SATA hard disks, flash drives, and USB mass storage devices such as portable music players and digital cameras) as SCSI disks. The rest of the naming scheme is similar to the older /dev/hd* scheme described above.

  • /dev/sr*: Optical drives (CD/DVD readers and burners).

Furthermore, symbolic links like /dev/cdrom, /dev/dvd, and /dev/floppy are commonly observed, designed to conveniently direct to the actual device files.
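The trailing-digit convention makes it simple to separate a partition name into the whole-device name and the partition number. The following sketch uses shell parameter expansion and assumes a name of the /dev/hd* or /dev/sd* form; device names that contain digits elsewhere would need different handling.

```shell
# Split a partition name of the /dev/hd* or /dev/sd* form into its parts.
partition=/dev/sdb1

disk=${partition%%[0-9]*}      # remove everything from the first digit on
number=${partition#"$disk"}    # what remains is the partition number

echo "$partition is partition $number of $disk"
```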

If we are working on a system that does not automatically mount removable devices, we can use the following technique to determine how the removable device is named when it is attached. First, start a real-time view of the /var/log/messages or /var/log/syslog file (superuser privileges may be required for this):

The file will show the last few lines and then halt. Subsequently, connect the removable device—let's say, for instance, a 16 MB flash drive. The kernel will promptly detect the device and initiate probing:

Once the display halts again, press Ctrl-c to regain the prompt. The notable segments of the output revolve around the recurrent mentions of [sdb], aligning with our anticipation of a SCSI disk device name. With this knowledge, two lines stand out as particularly enlightening:

This tells us that the device name is /dev/sdb for the entire device and /dev/sdb1 for the first partition on the device. As we have seen, working with Linux is full of interesting detective work!
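The same detective work can be done with sed once the messages have been captured. The log lines below are hypothetical, modeled on the kind of kernel messages described above; the exact wording varies between kernel versions, so the patterns may need adjusting.

```shell
# Hypothetical kernel messages of the kind logged when a flash drive is attached.
log='kernel: sd 3:0:0:0: [sdb] Attached SCSI removable disk
kernel:  sdb: sdb1'

# Whole-device names appear in square brackets, for example [sdb].
printf '%s\n' "$log" | sed -n 's/.*\[\(sd[a-z]*\)\].*/\1/p' | sort -u

# Partition names appear at the end of the line that lists them.
printf '%s\n' "$log" | sed -n 's/.* \(sd[a-z]*[0-9][0-9]*\)$/\1/p'
```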

Tip

Using the tail -f /var/log/messages technique is a great way to watch what the system is doing in near real-time.

With our device name in hand, we can now mount the flash drive:

The device name will remain the same as long as the device stays physically attached to the computer and the computer is not rebooted.

Creating New File Systems

Suppose we aim to reformat the flash drive, replacing its current FAT32 system with a Linux-native file system. This process involves two primary steps: 1. Optionally establishing a new partition layout if the existing one doesn't meet our preferences, and 2. Generating a fresh, vacant file system on the drive.

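Warning

The exercise that follows reformats a flash drive, which erases everything on it. Use a drive that holds nothing of value, and make certain that the device name you specify belongs to that drive and not to another disk on your system.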

Manipulating Partitions With fdisk

The fdisk program allows us to interact directly with disk-like devices, such as hard disk drives and flash drives, at a very low level. With it we can edit, delete, and create partitions on the device. To work with our flash drive, we must first unmount it (if needed) and then invoke the fdisk program as follows:

Take note that it's essential to specify the device in terms of the entire device, rather than by partition number. Once the program initializes, we'll encounter the subsequent prompt:

Entering an m will display the program menu:

Initially, our aim is to inspect the current partition layout. This involves entering p to print the partition table specifically for the device:

In this instance, we observe a 16 MB device featuring a solitary partition (1) that consumes 1006 out of the available 1008 cylinders on the device. This partition is labeled as a Windows 95 FAT32 partition. Occasionally, certain programs use this identification to restrict disk operations, yet it's typically not crucial to modify it. Nevertheless, for demonstration purposes, we'll modify it to indicate a Linux partition. Initially, we need to identify the ID used for specifying a Linux partition. Referring back to the previous listing, we note that the ID b designates the existing partition. To explore a list of available partition types, we refer to the program menu, where the following choice is presented:

Entering l at the prompt reveals an extensive list of potential types, including b for our current partition type and 83 designated for Linux.

Upon revisiting the menu, we encounter the option to modify a partition ID:

We enter t at the prompt and then enter the new ID:

All the necessary modifications have now been implemented. Until this moment, no alterations have been applied to the device itself—all changes have been stored in memory. To finalize the process, we'll write the modified partition table to the physical device and exit by entering w at the prompt:

Had we chosen to retain the device in its original state, inputting q at the prompt would have exited the program without applying any modifications. The somewhat foreboding warning message can be disregarded safely.

Creating A New File System With mkfs

Having completed our partition editing, however brief it was, it's now time to establish a new file system on our flash drive. For this task, we'll employ mkfs (short for "make file system"), a tool capable of generating file systems in diverse formats. Creating an ext3 file system on the device entails utilizing the -t option to specify the ext3 system type, followed by the designation of the device containing the partition slated for formatting:

Selecting ext3 as the file system type prompts the program to present a substantial amount of information. If you intend to revert the device to its original FAT32 file system, indicate vfat as the specified file system type:

The steps involved in partitioning and formatting can be replicated whenever new storage devices are introduced to the system. Although we utilized a small flash drive in our demonstration, this identical process is applicable to internal hard drives and other detachable storage devices like USB hard drives.
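Since mkfs destroys whatever is on the target, it can be worth wrapping it in a check that the partition is not currently mounted. The sketch below is one possible guard, not a standard tool: the function names are our own, the check is a plain string comparison against /proc/mounts (a Linux-specific file), and the formatting step must be run as the superuser.

```shell
# is_mounted DEVICE: succeed if DEVICE is listed as a mounted source
# in /proc/mounts (a Linux-specific file).
is_mounted() {
    awk -v dev="$1" '$1 == dev { found = 1 } END { exit !found }' /proc/mounts
}

# format_partition TYPE DEVICE: run mkfs only when DEVICE is not mounted.
# Must be run as the superuser, and it destroys any data on DEVICE.
format_partition() {
    fs_type=$1
    device=$2
    if is_mounted "$device"; then
        echo "refusing to format $device: it is mounted" >&2
        return 1
    fi
    mkfs -t "$fs_type" "$device"
}
```

With these definitions, format_partition ext3 /dev/sdb1 would create the ext3 file system described above, and format_partition vfat /dev/sdb1 would restore FAT32, provided /dev/sdb1 really is the flash drive on your system.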

Testing And Repairing File Systems

In our previous discussion about the /etc/fstab file, we encountered cryptic numbers at the end of each line. During system boot-up, a routine process examines the integrity of file systems before mounting them. This task is carried out by the fsck program, an abbreviation for "file system check." The final number in each fstab entry dictates the sequence for checking the devices. In the aforementioned example, the root file system undergoes the initial check, followed by the home and boot file systems. Devices marked with a zero as the final digit bypass routine checks.
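To see the checking order at a glance, the sixth field can be pulled out and sorted. The entries below are hypothetical stand-ins arranged to match the description above (root first, then home and boot, with a virtual file system that is never checked).

```shell
# Hypothetical fstab-style entries, for illustration only.
fstab_sample='LABEL=/12 / ext3 defaults 1 1
LABEL=/home /home ext3 defaults 1 2
LABEL=/boot /boot ext3 defaults 1 2
tmpfs /dev/shm tmpfs defaults 0 0'

# Skip comment lines and entries whose sixth field is zero, then print
# "order mount_point" sorted by checking order.
printf '%s\n' "$fstab_sample" | awk '!/^#/ && $6 > 0 { print $6, $2 }' | sort -n
```

The same pipeline can read /etc/fstab directly in place of the sample text.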

Beyond assessing file system integrity, fsck also attempts to rectify corrupted file systems, albeit success rates may vary depending on the level of damage. On Unix-like file systems, any salvaged fragments of files are deposited into the lost+found directory, situated at the root of each file system. To scrutinize our flash drive (ensure it's unmounted initially), the following steps could be undertaken:

In my observation, file system corruption occurs infrequently unless there's an underlying hardware issue, like a failing disk drive. Typically, if the system detects file system corruption during boot-up, it halts and prompts you to execute fsck before proceeding.

What is fsck

Within Unix culture, the term fsck frequently substitutes for a well-known word, sharing three letters. This substitution is particularly fitting, considering you might find yourself using the aforementioned word if you're compelled to execute fsck.

Formatting Floppy Disks

For individuals still utilizing computers equipped with floppy diskette drives, managing these devices is feasible as well. Priming a blank floppy for operation involves a two-step procedure. Initially, we conduct a low-level format on the diskette, followed by establishing a file system. To execute the formatting, we employ the fdformat program, indicating the name of the floppy device (typically /dev/fd0):

Next, we apply a FAT file system to the diskette with mkfs:

Observe that we opt for the msdos file system type to acquire the older, more compact style of file allocation tables. Once a diskette is readied, it can be mounted akin to other devices.

Moving Data Directly To/From Devices

While our typical perception involves data on computers being organized into files, it's also plausible to perceive data in its raw state. Examining a disk drive illustrates that it comprises numerous data blocks recognized by the operating system as directories and files. Yet, if we view a disk drive purely as an extensive assembly of data blocks, various practical tasks, like device cloning, become achievable.

The dd program fulfills this function by replicating blocks of data from one location to another. Its syntax, unique for historical reasons, is often employed in the following manner:

Suppose we possessed two USB flash drives of identical sizes, intending to precisely duplicate the contents of the first drive onto the second. Upon connecting both drives to the computer, where they are assigned to devices /dev/sdb and /dev/sdc respectively, the entire content of the first drive could be replicated onto the second drive using the following command:

On the other hand, if only the initial device were connected to the computer, we could duplicate its contents into a regular file, which could later be used for restoration or copying purposes:
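The same operations can be rehearsed safely with ordinary files standing in for devices. In this sketch the file names and sizes are arbitrary choices of ours; with real hardware, the if= and of= arguments would instead name devices such as /dev/sdb and /dev/sdc, and the commands would need superuser privileges. The copy is then verified with md5sum.

```shell
# Work in a scratch directory so that nothing important is touched.
workdir=$(mktemp -d)

# Stand-in for the first drive: 64 KiB of random data in a regular file.
dd if=/dev/urandom of="$workdir/drive.img" bs=1024 count=64 2>/dev/null

# Copy it block by block, as dd would copy one device to another.
dd if="$workdir/drive.img" of="$workdir/clone.img" 2>/dev/null

# Matching MD5 checksums indicate that the copy is identical.
sum_src=$(md5sum < "$workdir/drive.img")
sum_dst=$(md5sum < "$workdir/clone.img")
if [ "$sum_src" = "$sum_dst" ]; then
    echo "checksums match"
fi
```

The scratch directory can be deleted with rm -r "$workdir" when we are finished with it.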

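Warning!

dd overwrites whatever is named in its of= argument without asking for confirmation. Always double-check the input and output specifications before pressing Enter.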

Creating CD-ROM Images

Writing a recordable CD-ROM (either a CD-R or CD-RW) consists of two steps: first, constructing an ISO image file that is the exact file system image of the CD-ROM, and second, writing the image file onto the CD-ROM media.

Creating An Image Copy Of A CD-ROM

To generate an ISO image of an existing CD-ROM, we can employ dd to extract all the data blocks from the CD-ROM and replicate them into a local file. For instance, suppose we possess an Ubuntu CD and aim to create an ISO file for future duplications. Once the CD is inserted and its device name identified (let's assume it's /dev/cdrom), we can produce the ISO file using the following procedure:

This approach is applicable for data DVDs too, but it isn't suitable for audio CDs due to their lack of a file system for storage. In the case of audio CDs, consider using the cdrdao command instead.

Creating An Image From A Collection Of Files

Generating an ISO image file that encapsulates the contents of a directory involves utilizing the genisoimage program. Initially, a directory is crafted, housing all the desired files to be encompassed in the image. Then, the genisoimage command is executed to produce the image file. For instance, assuming we've assembled a directory named ~/cd-rom-files, stocked with files intended for our CD-ROM, we could form an image file named cd-rom.iso using the following command:

The -R option incorporates metadata supporting Rock Ridge extensions, enabling the utilization of long filenames and POSIX-style file permissions. Similarly, the -J option activates the Joliet extensions, facilitating long filenames compatible with Windows.
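Put together, the command might be wrapped as follows. The function name make_iso is our own; the genisoimage options are the ones just described, plus -o to name the output file.

```shell
# make_iso SOURCE_DIR IMAGE_FILE
# Build an ISO 9660 image of SOURCE_DIR with Rock Ridge (-R) and
# Joliet (-J) extensions. Requires the genisoimage program.
make_iso() {
    source_dir=$1
    image_file=$2
    genisoimage -o "$image_file" -R -J "$source_dir"
}
```

For the directory in our example, the call would be make_iso ~/cd-rom-files cd-rom.iso.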

A Program By Any Other Name...

If you look at online tutorials for creating and burning optical media like CD-ROMs and DVDs, you will frequently encounter two programs called mkisofs and cdrecord. These programs were part of a popular package called cdrtools, authored by Jörg Schilling. In the summer of 2006, Mr. Schilling made a license change to a portion of the cdrtools package that, in the opinion of many in the Linux community, created a license incompatibility with the GNU GPL. As a result, the cdrtools project was forked, and the fork includes replacement programs for cdrecord and mkisofs named wodim and genisoimage, respectively.

Writing CD-ROM Images

Once we possess an image file, the next step involves burning it onto our optical media. The majority of the commands detailed below are applicable to both recordable CD-ROM and DVD media.

Mounting An ISO Image Directly

A useful technique allows us to mount an ISO image located on our hard disk, treating it akin to optical media. Employing the -o loop option with mount (coupled with the essential -t iso9660 file system type), permits the mounting of the image file as though it were a device, integrating it into the file system tree:

In the provided instance, we established a mount point labeled /mnt/iso_image and proceeded to mount the image file image.iso onto that designated mount point. Once the image is mounted, it functions identically to a physical CD-ROM or DVD. It's crucial to remember to unmount the image when it's no longer required.
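A pair of small functions can keep the mount and unmount steps together. This is a sketch with names of our own choosing, and both functions must be run as the superuser.

```shell
# mount_iso IMAGE_FILE MOUNT_POINT
# Create the mount point if needed and attach the image through a loop device.
mount_iso() {
    image_file=$1
    mount_point=$2
    mkdir -p "$mount_point" || return 1
    mount -t iso9660 -o loop "$image_file" "$mount_point"
}

# unmount_iso MOUNT_POINT
# Detach the image once it is no longer needed.
unmount_iso() {
    umount "$1"
}
```

For the example above, the calls would be mount_iso image.iso /mnt/iso_image followed later by unmount_iso /mnt/iso_image.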

Blanking A Re-Writable CD-ROM

Before reusing rewritable CD-RW media, it's necessary to erase or blank it. This process can be accomplished using wodim by indicating the device name for the CD writer and specifying the type of blanking to be executed. The wodim program provides various blanking types, with the "fast" type being the most basic and expedient:

Writing An Image

To write an image onto optical media, wodim is utilized once more, this time specifying the optical media writer device's name and the image file's name:

In addition to the device name and image file, wodim supports a very large set of options. Two common ones are -v for verbose output and -dao, which writes the disc in disc-at-once mode. Disc-at-once mode should be used when preparing a disc for commercial reproduction. The default mode for wodim is track-at-once, which is useful for recording music tracks.
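The blanking and writing steps might be wrapped like this. The function names are our own, and the device name passed in (often something like /dev/cdrw, though that is an assumption) depends on the system.

```shell
# blank_disc DEVICE
# Erase a CD-RW using the "fast" blanking type.
blank_disc() {
    wodim dev="$1" blank=fast
}

# burn_image DEVICE IMAGE_FILE
# Write IMAGE_FILE with verbose output in disc-at-once mode.
# Drop -dao to use wodim's default track-at-once mode.
burn_image() {
    wodim -v -dao dev="$1" "$2"
}
```

Neither function runs until it is called, for example blank_disc /dev/cdrw followed by burn_image /dev/cdrw image.iso.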

Summary

In this chapter we have looked at the basic storage management tasks. There are, of course, many more. Linux supports a vast array of storage devices and file system schemes, and it offers many features for interoperating with other systems.
