Shell Programming/Introduction

A shell script is a program run by a Unix shell, a command-line interpreter. Most simply, it is a list of commands to be executed as if they were entered at the command line, which can be invoked as a single command: it is a “script” for the shell to run. This avoids retyping, simplifies invocation, and allows the script to be used like any other program, including being called by other shell scripts or other programs. From the perspective of the shell, a script is a list of commands that it executes; from the perspective of another program, a script is simply a program that can be executed like any other. More formally, a shell language is a scripting language for the shell, and implicitly for the host operating system: it allows one to invoke commands easily.

Beyond simply listing commands, shells typically provide programming language features such as variables and control flow constructs, and thus allow complex programs to be written as scripts. Any script can in principle be entered at the command line, since there is no fundamental difference between a “script” and a sequence of shell commands, but in practice only short constructs (such as a for loop) are typed interactively.
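As a minimal sketch of these features, the following script combines a variable, a for loop, and a function. The function name greet_all and the sample arguments are made up for illustration; the same lines could equally be typed at an interactive prompt.

```shell
#!/bin/sh
# Hypothetical example: a variable, a loop, and a function in plain sh.

# greet_all prints one greeting line per argument.
greet_all() {
    greeting="Hello"            # an ordinary shell variable
    for name in "$@"; do        # control flow: loop over the arguments
        printf '%s, %s!\n' "$greeting" "$name"
    done
}

greet_all world shell
```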

Basic shell scripts are very easy to write: one can simply use a transcript of a shell session. However, shell scripts run very slowly compared to compiled programs (such as C programs), or even interpreted programs (such as Python scripts) that call libraries instead of launching separate processes, and it is difficult to write complex programs as shell scripts. They are thus primarily suited for lengthy but simple tasks that extensively call the operating system (particularly for file manipulation) or other programs, especially in system administration. More complex or performance-sensitive tasks are instead written in general-purpose languages, traditionally C, more recently Perl or Python. Because they are easy to write, shell scripts are also well suited for one-off code and for rapid prototyping, as with other scripting languages, and provide a very good introduction both to programming generally and to operating systems specifically, particularly for users familiar with a command line.

Languages
This book treats sh-compatible shells.

There are various shells, each of which behaves differently, and thus each has its own programming language. Most shells are variants of the Bourne shell (sh; see Bourne Shell Scripting), sharing the basic syntax but adding features, and the associated programming languages are referred to as “dialects” of sh. The most common today, particularly on Linux, is bash, the “Bourne-again shell” (see Bash Shell Scripting), which is a complex shell providing many features. The Korn shell (ksh) also sees use, primarily on proprietary Unix systems (per the POSIX standard) and BSD OSes. Other shells are also used, such as the Almquist shell (ash), particularly in resource-constrained environments, for security, or for licensing reasons; such shells typically omit interactive features, which makes them smaller and more secure. Writing a script that is compatible across dialects of sh is called “portable shell scripting”. Portable shell scripting requires restricting oneself to a subset of features compatible across the target languages; this is largely equivalent to restricting oneself to the original Bourne shell language, but there are subtle edge cases to beware of.
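To illustrate the kind of restriction involved, the sketch below tests whether a string begins with a given prefix using only a case statement, which is available in any sh-compatible shell. The function name starts_with and the sample paths are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed if $1 begins with the prefix $2.
# bash and ksh also accept [[ $1 == "$2"* ]], but [[ is not specified
# by POSIX, so a portable script uses a case pattern instead.
starts_with() {
    case $1 in
        "$2"*) return 0 ;;   # quoted prefix is matched literally, * matches the rest
        *)     return 1 ;;
    esac
}

if starts_with "/usr/local/bin" "/usr"; then
    echo "under /usr"
fi
```

Quoting the prefix inside the pattern keeps any wildcard characters it contains from being interpreted, which is one of the subtle points that portable scripts have to get right.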

The C shell (csh), along with its variants, is the primary non-sh-compatible shell. It is also scriptable (see C Shell Scripting), but this is uncommon, due to severe problems with the language. Accordingly, csh was primarily used as an interactive shell in the 1980s and 1990s, alongside some variant of sh for scripting. This practice has largely disappeared, as sh-compatible shells have added the interactive features of csh. Thus for the purpose of this book, a shell means an sh-compatible shell.

Applications
Shell languages are fundamentally scripting languages for the operating system, and thus are suited to tasks that involve invoking operating system facilities, such as file operations, or multiple other commands, with relatively little logic or advanced language facilities. By contrast, tasks that involve complicated logic or invoke few other commands, or that can call a library instead of a separate command, are better written in a general-purpose language, today typically a high-level language like Python or Ruby. Perl was originally created for precisely this purpose (more advanced shell scripting) and retains much of the flavor and syntax of shell scripting; it was very popular from the late 1980s to the early 2000s, but has since declined in popularity.

The main applications of shell scripting are either automating a repetitive task or writing a one-off script for a complicated one-time task. Shell scripting is a key skill in system administration, and is also used to script a program or combination of programs when there is a good command-line interface but no built-in scripting language or libraries.

The key advantage of shell scripts over other languages is the ease of invoking other commands, without verbose syntax (explicit function calls and quoting), while the key disadvantages are limited facilities, awkward and delicate syntax (particularly due to compatibility with interactive use and due to quoting), and slow execution (due to the overhead of running separate processes, particularly context switches).
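Both sides of this trade-off show up in a short pipeline. In the sketch below (the function name count_unique and the sample input are hypothetical), three standard programs are combined with no call syntax beyond the pipe symbol, yet each stage runs as a separate process.

```shell
#!/bin/sh
# Hypothetical example: count the distinct lines arriving on standard input.
# sort, uniq, and wc are separate programs; the shell starts one process
# for each and connects them with pipes.
count_unique() {
    sort | uniq | wc -l
}

printf 'b\na\nb\n' | count_unique
```

A general-purpose language would typically do the same work inside a single process by calling library routines, at the price of more code.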

Alternatives
If libraries are available, then instead of a shell script that executes separate commands one can write a program that calls the libraries. This avoids the time overhead of separate processes (and the space overhead, if dynamically linked), at the cost of linking in the libraries. Linking incurs costs of its own: if statically linked, there is a time cost at compile time, a space cost on disk, and a space cost at run time (due to including the libraries); if dynamically linked, there is a time cost at run time, though dynamic linking is only required once per library.

Conversely, for very simple commands, particularly one-liners, a shell alias avoids the overhead of executing a separate script. For example, a directory listing shortcut defined as an alias is faster and simpler than an equivalent script.
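The following sketch shows what such a pair might look like. The shortcut name ll and the options passed to ls are assumptions chosen for illustration, not taken from the original text.

```shell
# Alias form (assumed name and options), typically placed in the shell's
# startup file. The shell expands it in place, so no script file is read
# and no extra shell process is started:
alias ll='ls -l'

# Script form: a separate executable file, for example saved as "ll" in a
# directory on the PATH. Running it starts a new shell just to run ls:
#   #!/bin/sh
#   ls -l "$@"
```

The script form does have one advantage: it can be called from other programs, whereas an alias exists only inside the shell that defines it.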