Embedded Systems/Locks and Critical Sections

An important part of an RTOS is its lock mechanisms and its Critical Section (CS) implementation. This section discusses some of the problems involved in creating these mechanisms.

Basic Critical Sections
Most embedded systems have at least one data structure that is written by one task and read by another task. With a preemptive scheduler, it is all too easy to write software that *seems* to work fine most of the time, but occasionally the writer is interrupted right in the middle of updating the data structure, the RTOS switches to the reader task, and the reader chokes on the inconsistent data.

We need some way of arranging things so that a writer's modification appears "atomic"—a reader always sees only the (consistent) old version, or the (consistent) new version, never some partially-modified inconsistent state.

There are a variety of ways to avoid this problem, including:
 * Design the data structure so that the writer can always update it from one consistent state to the next consistent state in a single atomic operation. This requires hardware that supports sufficiently powerful atomic primitives, and it is the approach taken by lock-free and wait-free algorithms, such as the read-twice-and-compare algorithm we discuss elsewhere.
 * Have the writer turn off the task scheduler while it is updating the data structure. Then the only time the reader could possibly see the data structure, the data structure is in a consistent state.
 * Have the writer turn off all interrupts (including the timer interrupt that kicks off the task scheduler) while it is updating the data structure. Then the only time the reader could possibly see the data structure, the data structure is in a consistent state. But this makes interrupt latency much worse.
 * Use a "lock" associated with each data structure. When the reader sees that the writer is updating the data structure, have the reader tell the task scheduler to run some other process until the writer is finished. (There are many kinds of locks).
 * Use a "monitor" associated with every routine that uses a data structure.
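The read-twice-and-compare idea from the first option above can be sketched with a sequence counter: the writer bumps the counter before and after each update, so the counter is odd while an update is in progress, and the reader retries until it sees the same even count before and after reading. (The names `SeqData`, `writer_update`, and `reader_read` are illustrative, not from any particular RTOS.)

```c
#include <assert.h>

/* Shared data guarded by a sequence counter. */
typedef struct {
    volatile unsigned seq; /* odd while an update is in progress */
    int value;             /* the shared data (simplified to one int) */
} SeqData;

void writer_update(SeqData *d, int new_value)
{
    d->seq++;              /* now odd: update in progress */
    d->value = new_value;  /* may take many instructions in real code */
    d->seq++;              /* even again: update complete */
}

int reader_read(const SeqData *d)
{
    unsigned before, after;
    int v;
    do {
        before = d->seq;
        v = d->value;
        after = d->seq;
    } while (before != after || (before & 1)); /* retry on a torn read */
    return v;
}
```

The reader never blocks the writer; it simply discards any value it might have read while the counter was changing or odd.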

Whenever a lock or CS mechanism is called, it is important that the RTOS disable the scheduler, to prevent the atomic operations from being preempted and executed incorrectly. Embedded systems need to be stable and robust, so we cannot risk the operating system itself being preempted while it is creating a lock or critical section. If we have a function called DisableScheduler, we can call it before any atomic operation is attempted, and a matching EnableScheduler function can restore the scheduler so that normal operation continues.

Let us now create a general function for entering a critical section, and one for exiting a critical section. By disabling the scheduler for the duration of the critical section, we guarantee that no preemptive task switching will occur inside it.
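A minimal sketch of these two functions, assuming the DisableScheduler/EnableScheduler calls described above (stubbed out here with a nesting counter so the example is self-contained):

```c
#include <assert.h>

static int scheduler_disable_count = 0;  /* nesting depth */

/* Stubs standing in for the RTOS calls described above. */
static void DisableScheduler(void) { scheduler_disable_count++; }
static void EnableScheduler(void)  { scheduler_disable_count--; }

/* Enter/exit a global critical section by suspending preemption.
   The counter lets critical sections nest safely: the scheduler is
   re-enabled only when the outermost section exits. */
void EnterCriticalSection(void)
{
    DisableScheduler();
}

void ExitCriticalSection(void)
{
    EnableScheduler();
}
```

Every EnterCriticalSection call must be paired with exactly one ExitCriticalSection call, or the scheduler stays off and the system hangs.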

This method has the disadvantage that it slows down the system and prevents other time-sensitive tasks from running. Next, we will show a method by which critical sections can be implemented that still allows preemption.

Critical Section Objects
Critical Sections, like many other terms in computing, can have a different definition than simply an operation that prevents preemption. For instance, many systems define a CS as an object that prevents multiple tasks from entering a given section of code.

Let us say that we are implementing a version of malloc on our system. We want to ensure that once a memory allocation attempt has started, no other memory allocation attempt can begin: only one allocation can be in progress at a time. However, we want to allow the malloc function to be preempted just like any other function. To implement this, we need a new data object called a CRITICAL_SECTION, or CRIT_X, or something of that nature; malloc enters this critical section before allocating and exits it when done. If two tasks call malloc at nearly the same time, the first one will enter the critical section, while the second one will wait, or be "blocked", in the EnterCS routine. When the first malloc finishes, the second task's EnterCS call will return, and its malloc will continue.
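A sketch of such a malloc wrapper, assuming a CRITICAL_SECTION type with EnterCS/ExitCS operations (the names, the `my_malloc` wrapper, and the spin-style EnterCS below are illustrative, not a real RTOS API):

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Illustrative critical-section object: a flag plus, in a real RTOS,
   a queue of tasks blocked waiting on it. */
typedef struct {
    volatile int locked;
} CRITICAL_SECTION;

static CRITICAL_SECTION malloc_cs = { 0 };

static void EnterCS(CRITICAL_SECTION *cs)
{
    /* In a real RTOS the task would block here; this sketch just spins.
       The test-and-set must itself be atomic on a preemptive system. */
    while (cs->locked)
        ;               /* wait for the current holder to leave */
    cs->locked = 1;
}

static void ExitCS(CRITICAL_SECTION *cs)
{
    cs->locked = 0;
}

/* malloc wrapper: only one allocation attempt may run at a time,
   but the task holding the CS can still be preempted. */
void *my_malloc(size_t size)
{
    void *p;
    EnterCS(&malloc_cs);
    p = malloc(size);   /* stand-in for the allocator's real work */
    ExitCS(&malloc_cs);
    return p;
}
```

Note that the scheduler stays on the whole time: a task inside my_malloc can be preempted freely; only a *second* allocation attempt is held back.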

To allow tasks working on other data structures to continue even though this data structure is locked, EnterCS is typically redefined so that a blocked task yields the processor to the scheduler instead of busy-waiting. The value of creating a Critical Section object, and using it to protect sensitive areas, is that this scheme does not slow down the system the way the first scheme does (by disabling the scheduler and preventing all other tasks from executing).
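One plausible sketch of that redefinition, assuming a hypothetical YieldToScheduler call that lets other tasks run while this one waits (stubbed out here with a counter so the example is self-contained):

```c
#include <assert.h>

/* Assumed critical-section object: a flag plus, in a real RTOS,
   a queue of tasks blocked waiting on it. */
typedef struct {
    volatile int locked;
} CRITICAL_SECTION;

/* Stub standing in for the hypothetical RTOS yield call; a real
   implementation would switch to another ready task. */
static int yield_count = 0;
static void YieldToScheduler(void) { yield_count++; }

void EnterCS(CRITICAL_SECTION *cs)
{
    /* Instead of spinning, the blocked task gives up the CPU, so
       tasks using unrelated data structures keep making progress. */
    while (cs->locked)
        YieldToScheduler();
    cs->locked = 1;  /* must be atomic with the test on a real system */
}
```

A production implementation would also put the blocked task on the CS's wait queue so the scheduler only wakes it when the lock is released, rather than polling on every yield.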

Some OSes, such as DragonFly BSD, implement EnterCS and ExitCS using "serializing tokens": when a process blocks while attempting to get a lock on another data structure, the OS briefly releases all the locks that process holds, then reacquires every requested lock before letting the process resume.