
# How Linux Works

![rw-book-cover](https://images-na.ssl-images-amazon.com/images/I/51i34N0H3tL._SL200_.jpg)

## Metadata
- Author: [[Brian Ward]]
- Full Title: How Linux Works
- Category: #linux #software-architecture

## Highlights
- abstraction — a fancy way of saying that you can ignore most of the details. For example, when you ride in a car, you normally don’t need to think about details such as the mounting bolts that hold the motor inside the car or the people who build and maintain the road upon which the car drives. If you’re a passenger in a car, all you really need to know is what the car does (transports you somewhere else) and a few basics about how to use it (how to operate the door and seat belt). ([Location 430](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=430))
- There are many terms for an abstracted subdivision in computer software, including subsystem, module, and package — but we’ll use the term component in this chapter because it’s simple. ([Location 440](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=440))
- A layer or level is a classification (or grouping) of a component according to where that component sits between the user and the hardware. Web browsers, games, and such sit at the top layer; at the bottom layer we have the memory in the computer hardware — the 0s and 1s. The operating system occupies most of the layers in between. ([Location 449](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=449))
- The hardware is at the base. ([Location 453](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=453))
- The next level up is the kernel, which is the core of the operating system. The kernel is software residing in memory that tells the CPU what to do. ([Location 455](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=455))
- Processes — the running programs that the kernel manages — collectively make up the system’s upper level, called user space.
  ([Location 457](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=457))
- There is a critical difference between the ways that the kernel and user processes run: The kernel runs in kernel mode, and the user processes run in user mode. ([Location 462](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=462))
- Of all of the hardware on a computer system, main memory is perhaps the most important. In its most raw form, main memory is just a big storage area for a bunch of 0s and 1s. Each 0 or 1 is called a bit. ([Location 476](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=476))
- You’ll often hear the term state in reference to memory, processes, the kernel, and other parts of a computer system. ([Location 480](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=480))
- The kernel is in charge of managing tasks in four general system areas:
  - Processes. The kernel is responsible for determining which processes are allowed to use the CPU.
  - Memory. The kernel needs to keep track of all memory — what is currently allocated to a particular process, what might be shared between processes, and what is free.
  - Device drivers. The kernel acts as an interface between hardware (such as a disk) and processes. It’s usually the kernel’s job to operate the hardware.
  - System calls and support. Processes normally use system calls to communicate with the kernel. ([Location 492](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=492))
- Process management describes the starting, pausing, resuming, and terminating of processes. The concepts behind starting and terminating processes are fairly straightforward, but describing how a process uses the CPU in its normal course of operation is a bit more complex. On any modern operating system, many processes run “simultaneously.” For example, you might have a web browser and a spreadsheet open on a desktop computer at the same time.
  However, things are not as they appear: The processes behind these applications typically do not run at exactly the same time. ([Location 506](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=506))
- The CPU (the actual hardware) interrupts the current process based on an internal timer, switches into kernel mode, and hands control back to the kernel. The kernel records the current state of the CPU and memory, which will be essential to resuming the process that was just interrupted. The kernel performs any tasks that might have come up during the preceding time slice (such as collecting data from input and output, or I/O, operations). The kernel is now ready to let another process run. The kernel analyzes the list of processes that are ready to run and chooses one. The kernel prepares the memory for this new process, and then prepares the CPU. The kernel tells the CPU how long the time slice for the new process will last. The kernel switches the CPU into user mode and hands control of the CPU to the process. ([Location 519](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=519))
- The context switch answers the important question of when the kernel runs. The answer is that it runs between process time slices during a context switch. ([Location 527](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=527))
- The kernel’s job is complicated because the following conditions must hold:
  - The kernel must have its own private area in memory that user processes can’t access.
  - Each user process needs its own section of memory.
  - One user process may not access the private memory of another process.
  - User processes can share memory.
  - Some memory in user processes can be read-only.
  - The system can use more memory than is physically present by using disk space as auxiliary.
    ([Location 534](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=534))
- Modern CPUs include a memory management unit (MMU) that enables a memory access scheme called virtual memory. ([Location 539](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=539))
- The kernel’s role with devices is pretty simple. A device is typically accessible only in kernel mode because improper access (such as a user process asking to turn off the power) could crash the machine. ([Location 550](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=550))
- There are several other kinds of kernel features available to user processes. For example, system calls (or syscalls) perform specific tasks that a user process alone cannot do well or at all. For example, the acts of opening, reading, and writing files all involve system calls. Two system calls, fork() and exec(), are important to understanding how processes start up:
  - fork() When a process calls fork(), the kernel creates a nearly identical copy of the process.
  - exec() When a process calls exec(program), the kernel starts program, replacing the current process. ([Location 555](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=555))
- As mentioned earlier, the main memory that the kernel allocates for user processes is called user space. Because a process is simply a state (or image) in memory, user space also refers to the memory for the entire collection of running processes. (You may also hear the more informal term userland used for user space.) ([Location 588](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=588))
- The bottom level tends to consist of small components that perform single, uncomplicated tasks. The middle level has larger components such as mail, print, and database services. Finally, components at the top level perform complicated tasks that the user often controls directly. Components also use other components.
  Generally, if one component wants to use another, the second component is either at the same service level or below. ([Location 600](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=600))
- The Linux kernel supports the traditional concept of a Unix user. A user is an entity that can run processes and own files. A user is associated with a username. For example, a system could have a user named billyjoe ([Location 611](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=611))
- Many important parts of the system are actually shell scripts — text files that contain a sequence of shell commands. ([Location 655](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=655))
- Once you’ve opened a shell, it should display a prompt at the top that usually ends with a dollar sign ($). On Ubuntu, that prompt should look like name@host:path$, and on Fedora, it’s [name@host path]$. ([Location 677](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=677))
- The cat command is one of the easiest Unix commands to understand; it simply outputs the contents of one or more files. ([Location 693](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=693))
- We’ll use cat to briefly explore Unix input and output (I/O). Unix processes use I/O streams to read and write data. Processes read data from input streams and write data to output streams. Streams are very flexible. For example, the source of an input stream can be a file, a device, a terminal, or even the output stream from another process. ([Location 702](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=702))
- Standard output is similar. The kernel gives each process a standard output stream where it can write its output. ([Location 717](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=717))
- The ls command lists the contents of a directory.
  ([Location 732](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=732))
- In its simplest form, cp copies files. ([Location 746](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=746))
- The mv (move) command is like cp. ([Location 754](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=754))
- The touch command creates a file. ([Location 762](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=762))
- To delete (remove) a file, use rm. ([Location 771](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=771))
- The echo command prints its arguments to the standard output: ([Location 774](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=774))
- The echo command is very useful for finding expansions of shell globs (“wildcards” such as *) and variables (such as $HOME), which you will encounter later in this chapter. ([Location 776](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=776))
- A path not beginning with / is called a relative path. ([Location 793](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=793))
- The current working directory is the directory that a process (such as the shell) is currently in. ([Location 797](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=797))
- The mkdir command creates a new directory dir ([Location 803](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=803))
- The rmdir command removes the directory dir: $ rmdir dir If dir isn’t empty, this command fails. However, if you’re impatient, you probably don’t want to laboriously delete all the files and subdirectories inside dir first. ([Location 807](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=807))
- You can use rm -rf dir to delete a directory and its contents, but be careful!
  ([Location 811](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=811))
- The -r option specifies recursive delete to repeatedly delete everything inside dir, and -f forces the delete operation. Don’t use the -rf flags with globs such as a star (\*). ([Location 813](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=813))
- The shell can match simple patterns to file and directory names, a process known as globbing. This is similar to the concept of wildcards in other systems. The simplest of these is the glob character \*, which tells the shell to match any number of arbitrary characters. For example, the following command prints a list of files in the current directory: $ echo \* ([Location 818](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=818))
- at\* expands to all filenames that start with at. \*at expands to all filenames that end with at. \*at\* expands to all filenames that contain at. ([Location 825](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=825))
- The grep command prints the lines from a file or input stream that match an expression. ([Location 852](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=852))
- The grep command is extraordinarily handy when operating on multiple files at once because it prints the filename in addition to the matching line. For example, if you want to check every file in /etc that contains the word root, you could use this command: ([Location 855](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=855))
- Two of the most important grep options are -i (for case-insensitive matches) and -v (which inverts the search, that is, prints all lines that don’t match). There is also a more powerful variant called egrep (which is just a synonym for grep -E).
  ([Location 859](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=859))
- grep understands patterns known as regular expressions that are grounded in computer science theory and are very common in Unix utilities. Regular expressions are more powerful than wildcard-style patterns, and they have a different syntax. There are two important things to remember about regular expressions:
  - .\* matches any number of characters (like the \* in wildcards).
  - . matches one arbitrary character. ([Location 863](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=863))
- The less command comes in handy when a file is really big or when a command’s output is long and scrolls off the top of the screen. ([Location 875](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=875))
- The pwd (print working directory) program simply outputs the name of the current working directory. ([Location 895](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=895))
- To see the differences between two text files, use diff: $ diff file1 file2 ([Location 903](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=903))
- If you see a file and are unsure of its format, try using the file command to see if the system can guess: ([Location 909](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=909))
- It’s frustrating when you know that a certain file is in a directory tree somewhere but you just don’t know where. Run find to find file in dir: $ find dir -name file -print ([Location 914](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=914))
- The find command accepts special pattern-matching characters such as \*, ([Location 920](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=920))
- To quickly view a portion of a file or stream of data, use the head and tail commands.
  For example, head /etc/passwd shows the first 10 lines of the password file, and tail /etc/passwd shows the last 10 lines. ([Location 929](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=929))
- The sort command quickly puts the lines of a text file in alphanumeric order. ([Location 937](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=937))
- Use the passwd command to change your password. ([Location 941](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=941))
- Change to your home directory, take a look around with ls, and then run ls -a. Do you see the difference in the output? When you run ls without the -a, you won’t see the configuration files called dot files. These are files and directories whose names begin with a dot (.). Common dot files are .bashrc and .login, and there are dot directories, too, such as .ssh. ([Location 948](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=948))
- The shell can store temporary variables, called shell variables, containing the values of text strings. Shell variables are very useful for keeping track of values in scripts, and some shell variables control the way the shell behaves. (For example, the bash shell reads the PS1 variable before displaying the prompt.) To assign a value to a shell variable, use the equal sign (=). Here’s a simple example: $ STUFF=blah ([Location 964](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=964))
- An environment variable is like a shell variable, but it’s not specific to the shell. All processes on Unix systems have environment variable storage. ([Location 973](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=973))
- PATH is a special environment variable that contains the command path (or path for short). A command path is a list of system directories that the shell searches when trying to locate a command.
  For example, when you run ls, the shell searches the directories listed in PATH for the ls program. If programs with the same name appear in several directories in the path, the shell runs the first matching program. If you run echo $PATH, you’ll see that the path components are separated by colons (:). For example: ([Location 984](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=984))
- Table 2-1. Special Characters

  | Character | Name(s) | Uses |
  | --- | --- | --- |
  | \* | asterisk, star | Regular expression, glob character |
  | . | dot | Current directory, file/hostname delimiter |
  | ! | bang | Negation, command history |
  | \| | pipe | Command pipes |
  | / | (forward) slash | Directory delimiter, search command |
  | \\ | backslash | Literals, macros (never directories) |
  | $ | dollar | Variable denotation, end of line |
  | ' | tick, (single) quote | Literal strings |
  | \` | backtick, backquote | Command substitution |
  | " | double quote | Semi-literal strings |
  | ^ | caret | Negation, beginning of line |
  | ~ | tilde, squiggle | Negation, directory shortcut |
  | # | hash, sharp, pound | Comments, preprocessor, substitutions |
  | [ ] | (square) brackets | Ranges |
  | { } | braces, (curly) brackets | Statement blocks, ranges |
  | \_ | underscore, under | Cheap substitute for a space |

  ([Location 1011](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=1011))
- Table 2-2. Command-Line Keystrokes

  | Keystroke | Action |
  | --- | --- |
  | CTRL-B | Move the cursor left |
  | CTRL-F | Move the cursor right |
  | CTRL-P | View the previous command (or move the cursor up) |
  | CTRL-N | View the next command (or move the cursor down) |
  | CTRL-A | Move the cursor to the beginning of the line |
  | CTRL-E | Move the cursor to the end of the line |
  | CTRL-W | Erase the preceding word |
  | CTRL-U | Erase from cursor to beginning of line |
  | CTRL-K | Erase from cursor to end of line |
  | CTRL-Y | Paste erased text (for example, from CTRL-U) |

  ([Location 1063](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=1063))
- To search for a manual page by keyword, use the -k option: ([Location 1111](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=1111))
- You can select a manual page by section, which is sometimes important because man displays the first manual page that it finds when matching a particular search term. For example, to read the /etc/passwd file description (as opposed to the passwd command), you can insert the section number before the page name: $ man 5 passwd ([Location 1143](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=1143))
- Some time ago, the GNU Project decided that it didn’t like manual pages very much and switched to another format called info (or texinfo). Often this documentation goes further than a typical manual page does, but it is sometimes more complex. To access an info manual, use info with the command name: $ info command ([Location 1151](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=1151))
- To send the output of command to a file instead of the terminal, use the > redirection character: $ command > file The shell creates file if it does not already exist. If file exists, the shell erases (clobbers) the original file first. (Some shells have parameters that prevent clobbering. For example, enter set -C to avoid clobbering in bash.)
  ([Location 1161](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=1161))
- You can append the output to the file instead of overwriting it with the >> redirection syntax: $ command >> file ([Location 1168](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=1168))
- Occasionally, you may redirect standard output but find that the program still prints something to the terminal. This is called standard error (stderr); it’s an additional output stream for diagnostics and debugging. For example, this command produces an error: ([Location 1177](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=1177))
- To channel a file to a program’s standard input, use the < operator: $ head < /proc/cpuinfo ([Location 1192](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=1192))
- The program name, ls. Some programs omit this identifying information, which can be annoying when writing shell scripts, but it’s not really a big deal. The filename, /dsafsda, which is a more specific piece of information. There’s a problem with this path. The error No such file or directory indicates the problem with the filename. ([Location 1205](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=1205))
- These messages pop up when you try to use a file as a directory or a directory as a file. For example: $ touch a $ touch a/b touch: a/b: Not a directory ([Location 1233](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=1233))
- Recall from Chapter 1 that a process is a running program. Each process on the system has a numeric process ID (PID). For a quick listing of running processes, just run ps on the command line. You should get a list like this one: ([Location 1252](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=1252))
- The fields are as follows:
  - PID. The process ID.
  - TTY. The terminal device where the process is running. More about this later.
  - STAT. The process status, that is, what the process is doing and where its memory resides. For example, S means sleeping and R means running. (See the ps(1) manual page for a description of all the symbols.)
  - TIME. The amount of CPU time in minutes and seconds that the process has used so far. In other words, the total amount of time that the process has spent running instructions on the processor.
  - COMMAND. This one might seem obvious, but be aware that a process can change this field from its original value. ([Location 1258](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=1258))
- The ps command has many options. To make things more confusing, you can specify options in three different styles — Unix, BSD, and GNU. Many people find the BSD style to be the most comfortable (perhaps because it involves less typing), so we’ll use the BSD style in this book. Here are some of the most useful option combinations:
  - ps x Show all of your running processes.
  - ps ax Show all processes on the system, not just the ones you own.
  - ps u Include more detailed information on processes.
  - ps w Show full command names, not just what fits on one line. ([Location 1268](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=1268))
- To terminate a process, send it a signal with the kill command. A signal is a message to a process from the kernel. When you run kill, you’re asking the kernel to send a signal to another process. In most cases, all you need to do is this: $ kill pid ([Location 1287](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=1287))
- Shells also support job control, which is a way to send TSTP (similar to STOP) and CONT signals to programs by using various keystrokes and commands. ([Location 1310](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=1310))
- Normally, when you run a Unix command from the shell, you don’t get the shell prompt back until the program finishes executing.
  However, you can detach a process from the shell and put it in the “background” with the ampersand (&); this gives you the prompt back. For example, if you have a large file that you need to decompress with gunzip (you’ll see this in 2.18 Archiving and Compressing Files), and you want to do some other stuff while it’s running, run a command like this one: $ gunzip file.gz & ([Location 1323](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=1323))
- The shell should respond by printing the PID of the new background process, and the prompt should return immediately so that you can continue working. ([Location 1330](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=1330))
- Every Unix file has a set of permissions that determine whether you can read, write, or run the file. Running ls -l displays the permissions. ([Location 1345](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=1345))
- The first character of the mode is the file type. A dash (-) in this position, as in the example, denotes a regular file, meaning that there’s nothing special about the file. ([Location 1350](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=1350))
- Each permission set can contain four basic representations:
  - r Means that the file is readable.
  - w Means that the file is writable.
  - x Means that the file is executable (you can run it as a program).
  - \- Means nothing. ([Location 1359](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=1359))
- To change permissions, use the chmod command. First, pick the set of permissions that you want to change, and then pick the bit to change.
  For example, to add group (g) and world (o, for “other”) read (r) permissions to file, you could run these two commands: $ chmod g+r file $ chmod o+r file ([Location 1383](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=1383))
- Or you could do it all in one shot: $ chmod go+r file ([Location 1388](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=1388))
- You may sometimes see people changing permissions with numbers, for example: $ chmod 644 file ([Location 1395](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=1395))
- This is called an absolute change because it sets all permission bits at once. To understand how this works, you need to know how to represent the permission bits in octal form (each numeral represents a number in base 8 and corresponds to a permission set). See the chmod(1) manual page or info manual for more. ([Location 1397](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=1397))
- Table 2-4. Absolute Permission Modes

  | Mode | Meaning | Used For |
  | --- | --- | --- |
  | 644 | user: read/write; group, other: read | files |
  | 600 | user: read/write; group, other: none | files |
  | 755 | user: read/write/execute; group, other: read/execute | directories, programs |
  | 700 | user: read/write/execute; group, other: none | directories, programs |
  | 711 | user: read/write/execute; group, other: execute | directories |

  ([Location 1402](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=1402))
- A symbolic link is a file that points to another file or a directory, effectively creating an alias (like a shortcut in Windows). Symbolic links offer quick access to obscure directory paths.
  In a long directory listing, symbolic links look like this (notice the l as the file type in the file mode): lrwxrwxrwx 1 ruser users 11 Feb 27 13:52 somedir -> /home/origdir ([Location 1426](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=1426))
- To create a symbolic link from target to linkname, use ln -s: $ ln -s target linkname The linkname argument is the name of the symbolic link, the target argument is the path of the file or directory that the link points to, and the -s flag specifies a symbolic link (see the warning that follows). ([Location 1441](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=1441))
- The program gzip (GNU Zip) is one of the current standard Unix compression programs. ([Location 1467](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=1467))
- Unlike the zip programs for other operating systems, gzip does not create archives of files; that is, it doesn’t pack multiple files and directories into one file. To create an archive, use tar instead: $ tar cvf archive.tar file1 file2 ... ([Location 1472](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=1472))
- To unpack a .tar file with tar use the x flag: $ tar xvf archive.tar In this command, the x flag puts tar into extract (unpack) mode. ([Location 1490](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=1490))
- Before unpacking, it’s usually a good idea to check the contents of a .tar file with the table-of-contents mode by using the t flag instead of the x flag. ([Location 1501](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=1501))
- Many beginners find it confusing that archives are normally found compressed, with filenames ending in .tar.gz. To unpack a compressed archive, work from the right side to the left; get rid of the .gz first and then worry about the .tar.
For example, these two commands decompress and unpack <file>.tar.gz: $ gunzip file.tar.gz $ tar xvf file.tar ([Location 1517](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=1517)) - The method shown above isn’t the fastest or most efficient way to invoke tar on a compressed archive, and it wastes disk space and kernel I/O time. A better way is to combine archival and compression functions with a pipeline. For example, this command pipeline unpacks <file>.tar.gz: $ zcat file.tar.gz | tar xvf - The zcat command is the same as gunzip -dc. The -d option decompresses and the -c option sends the result to standard output (in this case, to the tar command). ([Location 1528](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=1528)) - Another compression program in Unix is bzip2, whose compressed files end with .bz2. While marginally slower than gzip, bzip2 often compacts text files a little more, and it is therefore increasingly popular in the distribution of source code. ([Location 1546](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=1546)) - Now that you know how to examine files, change directories, and read manual pages, you’re ready to start exploring your system files. The details of the Linux directory structure are outlined in the Filesystem Hierarchy Standard, or FHS (http://www.pathname.com/fhs/), but a brief walkthrough should suffice for now. ([Location 1561](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=1561)) - /bin Contains ready-to-run programs (also known as an executables), including most of the basic Unix commands such as ls and cp. Most of the programs in /bin are in binary format, having been created by a C compiler, but some are shell scripts in modern systems. /dev Contains device files. You’ll learn more about these in Chapter 3. /etc This core system configuration directory (pronounced EHT-see) contains the user password, boot, device, networking, and other setup files. 
Many items in /etc are specific to the machine’s hardware. For example, the /etc/X11 directory contains graphics card and window system configurations. /home Holds personal directories for regular users. Most Unix installations conform to this standard. /lib An abbreviation for library, this directory holds library files containing code that executables can use. There are two types of libraries: static and shared. The /lib directory should contain only shared libraries, but other lib directories, such as /usr/lib, contain both varieties as well as other auxiliary files. (We’ll discuss shared libraries in more detail in Chapter 15.) /proc Provides system statistics through a browsable directory-and-file interface. Much of the /proc subdirectory structure on Linux is unique, but many other Unix variants have similar features. The /proc directory contains information about currently running processes as well as some kernel parameters. /sys This directory is similar to /proc in that it provides a device and system interface. You’ll read more about /sys in Chapter 3. /sbin The place for system executables. Programs in /sbin directories relate to system management, so regular users usually do not have /sbin components in their command paths. Many of the utilities found here will not work if you’re not running them as root. /tmp A storage area for smaller, temporary files that you don’t care much about. Any user may read to and write from /tmp, but the user may not have permission to access another user’s files there. Many programs use this directory as a workspace. If something is extremely important, don’t put it in /tmp because most distributions clear /tmp when the machine boots and some even remove its old files periodically. Also, don’t let /tmp fill up with garbage because its space is usually shared with something critical (like the rest of /, for example). /usr Although pronounced “user,” this subdirectory has no user files. 
Instead, it contains a large directory hierarchy, including the bulk of the Linux system. Many of the directory names in /usr are the same as those ([Location 1570](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=1570))
- in the root directory (like /usr/bin and /usr/lib), and they hold the same type of files. (The reason that the root directory does not contain the complete system is primarily historic: in the past, it was to keep space requirements low for the root.) /var The variable subdirectory, where programs record runtime information. System logging, user tracking, caches, and other files that system programs create and manage are here. (You’ll notice a /var/tmp directory here, but the system doesn’t wipe it on boot.) ([Location 1595](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=1595))
- /boot Contains kernel boot loader files. These files pertain only to the very first stage of the Linux startup procedure; you won’t find information about how Linux starts up its services in this directory. See Chapter 5 for more about this. /media A base attachment point for removable media such as flash drives that is found in many… ([Location 1602](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=1602))
- /include Holds header files used by the C compiler. /info Contains GNU info manuals (see 2.13 Getting Online Help). /local Is where administrators can install their own software. Its structure should look like that of / and /usr. /man Contains manual pages. /share Contains files that should work on other kinds of Unix machines with no loss of functionality. In the past, networks of machines would share this directory, but a true /share directory is becoming rare because there are no space issues on modern disks. Maintaining a /share directory is often just a pain. In any case, /man, /info, and some other subdirectories are often found here.
([Location 1612](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=1612))
- Most larger distributions use a package called sudo to allow administrators to run commands as root when they are logged in as themselves. For example, in Chapter 7, you’ll learn about using vipw to edit the /etc/passwd file. You could do it like this: `$ sudo vipw` When you run this command, sudo logs this action with the syslog service under the local2 facility. You’ll also learn more about system logs in Chapter 7. ([Location 1634](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=1634))
- The first line defines an ADMINS user alias with the two users, and the second line grants the privileges. The ALL = NOPASSWD: ALL part means that the users in the ADMINS alias can use sudo to execute commands as root. The second ALL means “any command.” ([Location 1647](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=1647))
- You should now know how to do the following at the command line: run programs, redirect output, interact with files and directories, view process listings, view manual pages, and generally make your way around the user space of a Linux system. ([Location 1663](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=1663))
- It is easy to manipulate most devices on a Unix system because the kernel presents many of the device I/O interfaces to user processes as files. These device files are sometimes called device nodes. Not only can a programmer use regular file operations to work with a device, but some devices are also accessible to standard programs like cat, so you don’t have to be a programmer to use a device. ([Location 1679](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=1679))
- Note the first character of each line (the first character of the file’s mode) in Example 3-1. If this character is b, c, p, or s, the file is a device.
These letters stand for block, character, pipe, and socket, respectively, as described in more detail below. ([Location 1695](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=1695))
- Block device: Programs access data from a block device in fixed chunks. The sda1 in the preceding example is a disk device, a type of block device. Disks can be easily split up into blocks of data. Because a block device’s total size is fixed and easy to index, processes have random access to any block in the device with the help of the kernel. ([Location 1699](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=1699))
- Character devices work with data streams. You can only read characters from or write characters to character devices, as previously demonstrated with /dev/null. Character devices don’t have a size; when you read from or write to one, the kernel usually performs a read or write operation on the device. Printers directly attached to your computer are represented by character devices. It’s important to note that during character device interaction, the kernel cannot back up and reexamine the data stream after it has passed data to a device or process. Pipe device: Named pipes are like character devices, with another process at the other end of the I/O stream instead of a kernel driver. Socket device: Sockets are special-purpose interfaces that are frequently used for interprocess communication. They’re often found outside of the /dev directory. Socket files represent Unix domain sockets; you’ll learn more about those in Chapter 10. ([Location 1704](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=1704))
- The traditional Unix /dev directory is a convenient way for user processes to reference and interface with devices supported by the kernel, but it’s also a very simplistic scheme. The name of the device in /dev tells you a little about the device, but not a lot.
Another problem is that the kernel assigns devices in the order in which they are found, so a device may have a different name between reboots. ([Location 1723](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=1723))
- The program dd is extremely useful when working with block and character devices. This program’s sole function is to read from an input file or stream and write to an output file or stream, possibly doing some encoding conversion on the way. ([Location 1776](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=1776))
- dd copies data in blocks of a fixed size. Here’s how to use dd with a character device and some common options: ([Location 1779](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=1779))
- As you can see, the dd option format differs from the option formats of most other Unix commands; it’s based on an old IBM Job Control Language (JCL) style. Rather than use the dash (-) character to signal an option, you name an option and set its value to something with the equals (=) sign. The preceding example copies a single 1024-byte block from /dev/zero (a continuous stream of zero bytes) to new_file ([Location 1782](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=1782))
- It can sometimes be difficult to find the name of a device (for example, when partitioning a disk). Here are a few ways to find out what it is: Query udevd using udevadm (see 3.5 udev). Look for the device in the /sys directory. Guess the name from the output of the dmesg command (which prints the last few kernel messages) or the kernel system log file (see 7.2 System Logging). This output might contain a description of the devices on your system. ([Location 1809](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=1809))
- For a disk device that is already visible to the system, you can check the output of the mount command.
Run cat /proc/devices to see the block and character devices for which your system currently has drivers. Each line consists of a number and name. The number is the major number of the device as described in 3.1 Device Files. If you can guess the device from the name, look in /dev for the character or block devices with the corresponding major number, and you’ve found the device files. ([Location 1815](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=1815))
- Most hard disks attached to current Linux systems correspond to device names with an sd prefix, such as /dev/sda, /dev/sdb, and so on. These devices represent entire disks; the kernel makes separate device files, such as /dev/sda1 and /dev/sda2, for the partitions on a disk. ([Location 1824](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=1824))
- The naming convention requires a little explanation. The sd portion of the name stands for SCSI disk. Small Computer System Interface (SCSI) was originally developed as a hardware and protocol standard for communication between devices such as disks and other peripherals. ([Location 1828](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=1828))
- The Linux block devices /dev/hda, /dev/hdb, /dev/hdc, and /dev/hdd are common on older versions of the Linux kernel and with older hardware. ([Location 1858](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=1858))
- Terminals are devices for moving characters between a user process and an I/O device, usually for text output to a terminal screen. The terminal device interface goes back a long way, to the days when terminals were typewriter-based devices. Pseudoterminal devices are emulated terminals that understand the I/O features of real terminals. But rather than talk to a real piece of hardware, the kernel presents the I/O interface to a piece of software, such as the shell terminal window that you probably type most of your commands into.
Two common terminal devices are /dev/tty1 (the first virtual console) and /dev/pts/0 (the first pseudoterminal device). The /dev/pts directory itself is a dedicated filesystem. ([Location 1864](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=1864))
- Linux has two primary display modes: text mode and an X Window System server (graphics mode, usually via a display manager). Although Linux systems traditionally booted in text mode, most distributions now use kernel parameters and interim graphical display mechanisms (bootsplashes such as plymouth) to completely hide text mode as the system is booting. In such cases, the system switches over to full graphics mode near the end of the boot process. ([Location 1874](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=1874))
- Older RS-232 type and similar serial ports are special terminal devices. ([Location 1895](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=1895))
- Representing an interface type that has largely been replaced by USB, the unidirectional parallel port devices /dev/lp0 and /dev/lp1 correspond to LPT1: and LPT2: in Windows. ([Location 1901](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=1901))
- In modern Linux systems, you do not create your own device files; this is done with devtmpfs and udev (see 3.5 udev). ([Location 1921](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=1921))
- The mknod command creates one device. You must know the device name as well as its major and minor numbers. For example, creating /dev/sda1 is a matter of using the following command: ([Location 1923](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=1923))
- We’ve already talked about how unnecessary complexity in the kernel is dangerous because you can too easily introduce system instability.
([Location 1939](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=1939))
- The devtmpfs filesystem was developed in response to the problem of device availability during boot (see 4.2 Filesystems for more details on filesystems). This filesystem is similar to the older devfs support, but it’s simplified. ([Location 1948](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=1948))
- The udevd daemon operates as follows: The kernel sends udevd a notification event, called a uevent, through an internal network link. udevd loads all of the attributes in the uevent. udevd parses its rules, and it takes actions or sets more attributes based on those rules. ([Location 1965](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=1965))
- The SATA hard disk and optical drive shown in Figure 3-2 both use the same SATA interface. To connect the SATA-specific drivers of the kernel to the SCSI subsystem, the kernel employs a bridge driver, as with the USB drives, but with a different mechanism and additional complications. The optical drive speaks ATAPI, a version of SCSI commands encoded in the ATA protocol. However, the hard disk does not use ATAPI and does not encode any SCSI commands! ([Location 2121](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=2121))
- When a user-space process communicates with the SCSI subsystem, it normally does so through the block device layer and/or another kernel service that sits on top of an SCSI device class driver (like sd or sr). In other words, most user processes never need to know anything about SCSI devices or their commands. ([Location 2135](https://readwise.io/to_kindle?action=open&asin=B00PKTGLWM&location=2135))
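
Note to self: the last highlight says a user process reaches a disk through ordinary file operations on the block device layer, not through SCSI commands. A minimal Python sketch of that idea follows. It is only an illustration under stated assumptions: a temporary regular file stands in for a block device such as /dev/sda (reading a real device usually requires root), and the names `read_block` and `make_demo_image` and the 512-byte block size are hypothetical choices, not something from the book.

```python
import os
import tempfile

BLOCK_SIZE = 512  # assumed block size for illustration only


def read_block(path, block_number, block_size=BLOCK_SIZE):
    """Read one fixed-size block at a block-aligned offset using plain file I/O."""
    fd = os.open(path, os.O_RDONLY)
    try:
        # os.pread reads at an absolute offset, which mirrors random access
        # to any block; no SCSI-level knowledge is involved.
        return os.pread(fd, block_size, block_number * block_size)
    finally:
        os.close(fd)


def make_demo_image(num_blocks=4, block_size=BLOCK_SIZE):
    """Create a temporary regular file standing in for a block device.

    Block i is filled with the byte value i so reads are easy to check.
    """
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        for i in range(num_blocks):
            f.write(bytes([i]) * block_size)
    return path


if __name__ == "__main__":
    image = make_demo_image()
    try:
        block = read_block(image, 2)
        print(len(block), block[0])
    finally:
        os.remove(image)
```

The same `read_block` call should work unchanged on a real block device path, given sufficient permissions, which is the point of the abstraction: the kernel's block layer and device class driver handle everything below the file interface.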