Operating System Fundamentals

What is an Operating System
Operating systems are typically divided into two parts: the kernel and userland.

The kernel provides a layer through which software interacts with the hardware. It abstracts the hardware, allowing much of the same software to run identically on very different machines. The kernel exposes system calls so that userland can request its services. It handles many things, including filesystems (typically, though not always), devices, and control of processes.

Userland is everything other than the kernel. All processes created by the user, including the terminal, run in userland. The Graphical User Interface (GUI) that displays programs also lives in userland.

The Unix Shell
The Unix shell is a command interpreter that serves as the primary interface between users and the OS in a command-line environment, e.g. a terminal or terminal emulator. A shell is an essential (and often preferred) tool, but it is just an ordinary user program that relies on system calls to get most of its work done - so it is only a "shell" around the kernel.

Popular Shells
Many shells exist today, each with its own set of features. The most common is the Bourne shell. The Bourne shell (known colloquially by its conventional location, /bin/sh) has been around for decades and can be found on essentially any Unix computer. While it lacks certain interactive features, it is so commonplace that a script written for it will run on essentially any Unix system.

Functions of a Shell
A shell's primary duty is to present its user with a command prompt (e.g. $), wait for a command, and execute it.

A shell may also be used for writing programs, by placing shell commands in a text file. The first line of the file should name an interpreter in the form #!interpreter (e.g. #!/bin/sh). When the file is executed, Unix reads this line and thus knows which shell to use to interpret the commands.

Creating a Shell
The overall structure of a shell can be:

repeat forever
    read one line
    parse the command into a list of arguments
    if the line starts with a command name (e.g. cd and exit) then
        perform the function (if it's exit, break out of the loop)
    else (it invokes a program, e.g. ls and cat)
        execute the program

To read a command, we read one line at a time and split the line into tokens.

To execute a program, the shell first uses the fork system call to duplicate the current process. The child process then runs the requested program while the parent waits for it to finish.

use fork to clone a process
A new process is created by forking the current process. Note that the fork function call (a system call) is invoked once but returns twice, because when the call finishes there are two processes executing the same code.

use execvp to run a new program in the foreground
Note that the argument list must contain the program name as its first element and MUST end with a NULL pointer, which marks the end of the list. Otherwise, execvp won't function correctly.

use dup2 to redirect standard output
To redirect standard output, open a file (e.g. "test.txt") for writing and use dup2 to make the standard output synonymous with the open file - bytes sent to the standard output will then go to the file. The "O_WRONLY | O_CREAT" flags cause the file to be opened for writing and to be created if it doesn't already exist.

Information about each open file is recorded in a table managed by the OS. Each entry corresponds to an open file, and the index of each entry is an integer (the file descriptor), which is returned to the process as the return value of the open system call. The entries for standard input, standard output, and standard error are reserved with predefined indices: STDIN_FILENO, STDOUT_FILENO, and STDERR_FILENO (defined in <unistd.h>). The dup2(index1, index2) function copies the entry at index1 to the entry at index2, which makes the file descriptor index2 synonymous with the file descriptor index1.

redirect standard output in a child process
Calling "dup2 (fd, STDOUT_FILENO)" in the main process redirects the standard output of the current (main) process to the open file. If you want to redirect the standard output of a child process to a file instead, you need to wait until the child process has been created - that is, call dup2 in the child after the fork function call. With that arrangement, a "hello" message from the parent process still goes to the standard output, while the standard output of the child process is redirected to the file.

An example pipe connecting two commands
To connect two commands with a pipe, the original list of arguments is broken into two lists at the pipe symbol, which is replaced by a NULL pointer. This allows us to use the two argument lists to run the two separate commands/programs connected by the pipe.

A process is a program in execution. It is an abstraction for an entity managed by the OS. A process has its own address space, along with other information kept in OS-managed data structures.

Input and Output


Input and output are perhaps the most prevalent concepts in Unix. In Unix everything is a file, and this is by design, so that programs can interact with different devices in a common manner. A file may be given as the input of a program, or a file may be created from the output of a program.

Every process has a set of input and output streams. While a process may open as many as the developer desires, every process has at least three. These streams are known as standard input, standard output, and standard error. Many programs use these simple streams so that they can be easily combined by the user. Take the programs cat and grep for example. cat sends a given file to standard output (typically your terminal unless otherwise specified), and grep searches for a pattern in its standard input. If one enters the command $ cat file.txt | grep "hi", the pipe connects cat's standard output to grep's standard input, so the file is searched for the text hi.

File System Concepts
 * file abstraction - a byte stream; meaning is imposed by file system users
 * file system users - end users (humans) and direct users (programs, e.g. an application or the shell)
 * user perspective - a collection of system calls, such as creat, open, close, seek, delete ...
 * file attributes - metadata: owner, size, timestamps ...
 * directory abstraction - a list of files and directories that maps the name of a file/directory onto the information needed to locate the data
 * absolute path and relative path
 * directory operations - create, delete, open, close, rename, link, unlink

File System Implementations
 * layout: partition, boot block, superblock, ...
 * disk block allocation:
    * contiguous allocation
    * linked list allocation: the first word in each block is used as a pointer to the next one.
    * file allocation table: linked list allocation using a table in memory.
    * i-node (index-node): an i-node is in memory when the corresponding file is open. With an i-node design we can calculate the largest possible file size.
 * directory implementation: i-node, long file names
 * file sharing: symbolic vs. hard link
 * disk block size: tradeoff and compromise, wasting disk space vs. performance (data rate)
 * keeping track of free blocks: linked list vs. bitmap
 * system backup
 * caching
 * steps in looking up a file under a path
