Embedded Systems/ARM Microprocessors

The ARM architecture is a widely used 32-bit RISC processor architecture. The ARM family accounts for about 75% of all 32-bit CPUs and about 90% of all embedded 32-bit CPUs. ARM Limited does not sell physical microprocessors; instead, it licenses several popular microprocessor cores to many vendors. ARM originally stood for Acorn RISC Machine, and later for Advanced RISC Machines.

Some cores offered by ARM:
 * ARM7TDMI
 * ARM9
 * ARM11

Some examples of ARM-based processors:
 * Intel XScale (PXA255 and PXA270), used in Palm PDAs
 * Philips LPC2000 family (ARM7TDMI-S core), LPC3000 family (ARM9 core)
 * Atmel AT91SAM7 (ARM7TDMI core)
 * STMicroelectronics STR710 (ARM7TDMI core)
 * Freescale MCIMX27 series (ARM9 core)

The lowest-cost ARM processors (in the LPC2000 series) have dropped below US$5 in single-unit quantities, which is less than the cost of many 16-bit and 8-bit microprocessors.

Thumb calling convention
In ARM Thumb code, the 16 registers r0-r15 typically have the same roles they have in all ARM code:
 * r0 - r3, called a1 - a4: argument/scratch/result registers.
 * r4 - r9, called v1 - v6: variables
 * r10, called sl: stack limit
 * r11, called fp: frame pointer (usually not used in Thumb code)
 * r12, called ip: intra-procedure-call scratch register
 * r13, called sp: stack pointer
 * r14, called lr: link register
 * r15, called pc: the program counter

The standard C calling convention for ARM Thumb is:

Subroutine-preserved registers
When a subroutine returns by placing the return address in pc (r15), the sp, fp, sl, and v1-v6 registers must contain the same values they held when the subroutine was called.

The stack
Every execution environment has a limit to how low in memory the stack can grow -- the "minimum sp".

In order to give interrupts (which may occur at any time) room to work, at every instant the memory between sp and the "minimum sp" must contain nothing of value to the executing program.

Systems in which the application and its library support code are responsible for detecting and handling stack overflow are called "explicit stack limit" systems. In such systems, the sl register must always point to an address at least 256 bytes above the "minimum sp".
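The relationship between sp, sl, and the "minimum sp" can be sketched as a toy Python model. The function names and the notion of a frame size are illustrative assumptions; only the 256-byte margin comes from the rule above, and the model assumes a stack that grows toward lower addresses.

```python
# Toy model of the "explicit stack limit" rule; all names are hypothetical.
SL_MARGIN = 256  # bytes that must separate sl from the minimum sp


def sl_is_valid(sl: int, minimum_sp: int) -> bool:
    """True if sl points at least 256 bytes above the minimum sp."""
    return sl >= minimum_sp + SL_MARGIN


def would_overflow(sp: int, frame_size: int, sl: int) -> bool:
    """True if growing the stack by frame_size bytes would move sp below sl.

    Assumption: the stack grows downward, and any sp below sl counts as overflow.
    """
    return sp - frame_size < sl
```

With a minimum sp of 0x20000000, for instance, any sl below 0x20000100 would fail the first check.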

Caller-preserved registers
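A subroutine is free to clobber a1-a4, ip, and lr. A caller that still needs any of those values after the call must save them before making the call.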

Return values
If the subroutine returns a simple value no bigger than one word, the value must be in a1 (r0).

If the subroutine returns a simple floating-point value, the value is encoded in a1; or {a1, a2}; or {a1, a2, a3}, whichever is sufficient to hold the full precision.

A typical subroutine
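The simplest entry and exit sequence for Thumb functions is a single push {saved registers, lr} on entry and a single pop {saved registers, pc} on exit, where "saved registers" stands for whichever subroutine-preserved registers the function uses.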

ARM calling convention
The standard C calling convention for ARM is specified in detail by ARM Limited in the Procedure Call Standard for the ARM Architecture (AAPCS).

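The simplest entry and exit sequence for 32-bit ARM functions is very similar to that for Thumb functions: a single stmfd sp!, {saved registers, lr} on entry and a single ldmfd sp!, {saved registers, pc} on exit.

Using alternate mnemonics for the same instructions, the entry is push {saved registers, lr} and the exit is pop {saved registers, pc}.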

The BL (branch-and-link) instruction stores the return address in the link register LR (r14) and loads the program counter PC (r15) with the subroutine address. Typical subroutines (as shown above) immediately push that return address onto the stack. That frees up r14 so that the subroutine can call sub-subroutines of its own.
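The interplay between BL, lr, and the stack can be illustrated with a toy Python model. It is a sketch, not an emulator: the ToyCpu class, its method names, and the addresses are hypothetical, pc is treated simply as the address of the current instruction (ignoring pipeline offsets and the Thumb state bit), and the stack is assumed to grow downward in 4-byte words.

```python
# Toy model of BL and of saving lr on a descending stack.
# Class, method names, and addresses are hypothetical.


class ToyCpu:
    def __init__(self, sp: int = 0x1000, pc: int = 0x100):
        self.lr = 0        # r14, link register
        self.pc = pc       # r15, program counter (simplified)
        self.sp = sp       # r13, stack pointer
        self.memory = {}   # address -> 32-bit word

    def bl(self, target: int, instruction_size: int = 4) -> None:
        """Branch and link: lr gets the address after the BL, pc gets target."""
        self.lr = self.pc + instruction_size
        self.pc = target

    def push_lr(self) -> None:
        """Entry sequence: store lr so that a nested BL may overwrite it."""
        self.sp -= 4
        self.memory[self.sp] = self.lr

    def pop_pc(self) -> None:
        """Exit sequence: load the saved return address straight into pc."""
        self.pc = self.memory[self.sp]
        self.sp += 4


cpu = ToyCpu()
cpu.bl(0x200)   # caller at 0x100 calls "outer"; lr = 0x104
cpu.push_lr()   # outer saves its return address
cpu.bl(0x300)   # outer calls "inner"; lr is overwritten with 0x204
cpu.push_lr()   # inner saves its return address
cpu.pop_pc()    # inner returns into outer
cpu.pop_pc()    # outer returns to the original caller
```

The second BL overwrites lr, yet the outer routine can still return because its return address was saved on the stack first.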

Subroutine-preserved registers
Typically r4-r11 are used to hold local variables of the currently-executing routine.

The registers r4-r11 are "subroutine-preserved registers": when the subroutine returns by placing the return address in pc (r15), the registers r4-r11 and the stack pointer sp (r13) must contain the same values they held when the subroutine was called.
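As a rough illustration, the preservation contract can be expressed as a check on two register snapshots, one taken at the call and one at the return. The function and the dictionary layout are hypothetical, chosen only for this sketch.

```python
# Hypothetical checker for the "subroutine-preserved" contract.
PRESERVED = ["r4", "r5", "r6", "r7", "r8", "r9", "r10", "r11", "sp"]


def violated_registers(before: dict, after: dict) -> list:
    """Return the preserved registers whose values differ between snapshots.

    Scratch registers (r0-r3, r12, lr) are deliberately ignored.
    """
    return [name for name in PRESERVED if before[name] != after[name]]
```

An empty result means the subroutine honoured the contract for that call; a snapshot in which only r0 changed would also yield an empty list.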

Typical subroutines (as shown above) immediately push the values of those registers onto the stack. That frees up r4-r11 to hold local variables of the currently-executing subroutine.

Optimizing ARM compilers save and restore the precise subset of r4-r11 and r14 (if any) actually modified by that subroutine, since it is a little slower (but otherwise harmless) to save and restore registers that are unused by that subroutine.
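The selection an optimizing compiler makes can be sketched as a simple filter. This is a hypothetical helper, not how any particular compiler is implemented; it assumes the set of registers written by the function body is already known.

```python
# Hypothetical helper: choose which registers a prologue needs to save.
SAVE_CANDIDATES = ["r4", "r5", "r6", "r7", "r8", "r9", "r10", "r11", "lr"]


def registers_to_save(modified: set) -> list:
    """Return the subset of r4-r11 and lr that the function body modifies.

    Scratch registers such as r0-r3 and r12 are never saved by the callee.
    """
    return [reg for reg in SAVE_CANDIDATES if reg in modified]
```

For a body that writes r0, r4, r6, and lr (lr being overwritten by a nested BL), the helper returns r4, r6, and lr; for a leaf function that touches only r0 and r1, it returns an empty list, so no entry or exit sequence is needed at all.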

Scratch registers
A subroutine is free to clobber r0-r3, r12, and the link register lr (r14).

The first four registers r0-r3 are used to pass argument values into a subroutine and to return a result value from a function.
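A toy model of this register assignment follows. It assumes every argument fits in one word, and it places any arguments beyond the fourth on the stack, which is how the ARM procedure call standard handles them but is not covered in the text above. The function name is hypothetical.

```python
# Toy model of argument placement; assumes word-sized arguments only.
ARGUMENT_REGISTERS = ["r0", "r1", "r2", "r3"]


def place_arguments(args: list) -> tuple:
    """Split arguments into register assignments and stack-passed leftovers."""
    in_registers = dict(zip(ARGUMENT_REGISTERS, args))
    # Assumption: arguments past the fourth are passed on the stack.
    on_stack = list(args[len(ARGUMENT_REGISTERS):])
    return in_registers, on_stack
```

Calling place_arguments with six values puts the first four in r0-r3 and leaves the last two for the stack.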

Mixed ARM32 and Thumb calls
Normal function calls are easy with the BL instruction. A programmer writes a single BL instruction naming the destination subroutine, and the assembler and linker automatically do the right thing: they insert the appropriate (32-bit-long) ARM32 BL instruction for an ARM32-to-ARM32 or ARM32-to-Thumb call, or the appropriate (32-bit-long) Thumb BL instruction for a Thumb-to-Thumb or Thumb-to-ARM32 call.

(Some mixed calls and some long branches require the linker to insert code that overwrites scratch register r12 with a temporary value. Exactly how the linker does that can be confusing, especially when the BX and BLX instructions are also involved.)
