Dan O'Dowd
President
Green Hills Software
The large software teams that are developing 100,000-line programs for
today's mission-critical real-time systems are caught between a rock and a
hard place. On one hand, they need a high-performance operating system
that delivers fast, predictable real-time response. But at the same time,
they need a secure real-time operating system that won't crash in the
field.
The trouble with most real-time operating systems is that they were
originally developed for 8-bit micros when a tiny footprint and fast
context switching were all that mattered. As a result, most fail to
leverage the protection facilities offered by today's memory management
units. Facilities that are indispensable to developing and executing
reliable code. Facilities that are available on most major RISC and CISC
processors today. Facilities that have been available on x86 processors
for more than a decade.
Fast, predictable real-time response will always be a must for mission
critical systems. But equally important to the large software development
teams who are working on complex projects are secure development and
operating environments that facilitate the production, execution, and
testing of reliable code. What designers need is a real-time operating
system that combines the nimble response and determinism of a real-time
microkernel with the security of a fully memory-protected RTOS.
Deeply-embedded systems can also benefit from the integrity of a
memory-protected RTOS. In the past, cost and performance constraints have
made designers of deeply embedded systems reluctant to utilize CPUs with
integrated MMUs. But dramatic gains in CPU performance, coupled with
significant improvements in integration level, have rendered traditional
cost and performance concerns virtually moot. At the same time, escalating
application complexity has shifted software productivity and reliability to
the forefront as design issues. Within five years, integrated MMUs will be
standard fare on low-cost embedded processors. RTOS vendors can do their
customers a big favor by making the move to a memory-protected design as
soon as possible.
Selecting an RTOS for mission-critical applications
When selecting an RTOS for a mission-critical embedded system, designers
should evaluate a variety of criteria, including performance,
functionality, and security. Support for pre-emptive, real-time
multitasking, of course, is a prerequisite. Partitioning sophisticated
real-time applications as a collection of tasks makes them easier to
conceptualize, prioritize, and debug. And support for priority-based
pre-emption gives designers flexible control over workload scheduling and
response to asynchronous events.
The default scheduling mechanism for most RTOSs is pre-emptive,
priority-based multitasking, with support for multiple priority
levels. With such a mechanism, tasks run until completion unless pre-empted
by a higher priority task. Tasks that have the same priority can run until
completion or execute in a round-robin fashion. Some RTOSs also offer
variations on this theme. Integrity, for example, supports a unique
scheduling mechanism that lets programmers specify guaranteed run times for
critical tasks within a given priority group.
Subroutine library vs trap interface
Beyond baseline multitasking facilities, designers should evaluate the
RTOS' underlying architecture and how it interacts with user
applications. Most real-time operating systems are implemented as libraries
of C-callable subroutines. The subroutine call mechanism provides an
efficient means of invoking kernel services. At the same time, the library
implementation provides a scalable, modular environment that enables
designers to optimize speed and size by selecting only those services that
they need for their target system. The user application, together with
selected RTOS services and boot code, are all linked and downloaded to the
target as a single image.
The RTOSs used in mission-critical systems take a different tack,
utilizing a trap mechanism that maximizes security by enabling the kernel
to enforce strict access permissions and provide extensive error
checking. The trap mechanism decouples user processes from kernel-mode
operations, erecting a firewall that prevents errant or malicious user
processes from accessing and corrupting critical kernel data structures.
To maximize reliability and security, the Integrity kernel maintains a list
of objects and permissions for each user process. When a process requests
an RTOS service or access to a kernel object, it supplies a simple integer
similar to a Unix descriptor, which the RTOS translates into a physical
address for that object. This translation mechanism prevents errant
processes from accessing kernel objects directly and crashing the kernel
(i.e., by providing a wrong or uninitialized address).
Kernel re-entrancy compromises security
One of the most frequented touted features of today's embedded real-time
operating systems is kernel re-entrancy. A re-entrant kernel is one in
which multiple kernel services may be invoked by multiple user processes
concurrently. Re-entrancy enables kernel services to be pre-empted, which
helps compensate for long kernel service run times and reduce kernel
latencies.
The problem with re-entrancy is that it gives multiple kernel-mode
processes concurrent access to critical kernel data structures. These
structures can be protected through mutual-exclusion mechanisms such as
semaphores. But these mechanisms are complex and error prone. They also
degrade overall kernel performance.
A better approach is to shorten the run times of kernel services,
thereby minimizing kernel latency without resorting to a re-entrant
design. Ideally, all services must complete and exit the kernel before
another service runs. This ensures that multiple processes can never have
concurrent access to critical kernel data structures.
Performance considerations
When evaluating RTOS performance, the two most often quoted metrics are
context switching and interrupt latency. Context switching refers to the
time required to pre-empt the current task, store its context, and begin
processing the next task. Most RTOSs provide context switching speeds in
the tens of microseconds, depending on the processor. Interrupt latency
refers to the worst-case time required for a processor to execute the first
line of code in an interrupt service routine (ISR) in response to a
hardware interrupt. Worst-case interrupt latencies for the fastest RTOSs
and processors are often less than 10 microseconds.
High-speed context switching is important for real-time applications
because it minimizes the overhead associated with workload
scheduling. However, this metric is often overrated, given that properly
designed applications spend the bulk of their time executing user code, not
switching between tasks.
More important for mission critical real-time systems is interrupt
latency, which impacts the system's ability to respond quickly and
predictably to asynchronous events (interrupts). Most RTOSs support a
two-tier interrupt response model based on interrupt service routines
(ISRs) and background RTOS tasks. ISRs are invoked automatically by
hardware interrupts and execute independent of the kernel. They typically
run for a brief time, performing time-critical front-line chores such as
moving characters into a buffer. The bulk of the processing is handled by
less critical background tasks, which are scheduled by the kernel according
to priority level.
Minimizing worst-case interrupt latency
A system's worst-case interrupt response time is dictated by three
parameters: The time it takes the CPU to recognize a hardware interrupt and
vector to the ISR; The time it takes the slowest ISR to finish executing
after an interrupt has been recognized; and the longest amount of time that
the kernel disables interrupts.
The speed with which a system can recognize an interrupt and vector to
an ISR is typically bounded by the run time of the CPU's slowest
instruction (i.e, the instruction that is being executed when the interrupt
is received). Most CPUs typically disable interrupts (or elevate their
priority) while executing an instruction in order to assure atomic
operation. As a result, interrupts can't be recognized until the currently
executed instruction is completed.
On most CPUs, multiply and divide instructions have the longest
latencies. This latency is beyond the control of the real-time operating
system. However, RTOS designers can help mitigate the effect of slow
instruction run times by omitting multiply and divide instructions from
their kernel code. Programmers who want to do the same at a system level
will need to eliminate these instructions from their application code and
ISRs.
The second factor that determines interrupt latency is the time that it
takes the currently executing ISR to complete once an interrupt is
received. Here again, it is common practice to suspend interrupts or
elevate interrupt priority until the current ISR has finished
executing. Unfortunately, ISR run times are also beyond the control of the
RTOS. As a result, it is incumbent upon programmers to keep ISR run times
short by using ISRs to perform only time-critical tasks and avoiding the
use of slow instructions such as multiply and divide.
The only parameter influencing interrupt latency that is under the
direct control of the kernel is the time that the kernel disables
interrupts. Virtually all real-time kernels need to disable interrupts for
brief periods of time in order to operate on critical kernel data
structures (such as a linked list of ready- to-run tasks). RTOSs like
Integrity that target mission-critical applications should never disables
interrupts, leaving interrupt latency entirely in the hands of the hardware
designer and application programmer.
Interprocess communications
Real-time operating systems support a variety of mechanisms for
supporting interprocess communications. Two of the most popular are
semaphores and pipes. Semaphores provide the most efficient means of
coordinating communications between processes using shared memory. However,
they're also complicated and error-prone, leaving the details of
establishing, accessing, and protecting shared-memory regions to the
programmer.
Pipes, which leverage the protection facilities of hardware MMUs,
provide a more secure paradigm for conducting interprocess
communications. They're also simpler and more intuitive. To request a pipe
service, processes simply designate a receiver and tell the kernel how much
data they want to send. The kernel takes care of the rest, verifying access
permissions, locating the necessary buffers, and performing the
transfer. In this way, processes are prevented from directly accessing
shared memory.
Because of their inherent simplicity and security, pipes should be used
as the default interprocess communications mechanism, with semaphores
engaged only for time-critical transactions. Even when developing for a CPU
with no MMU, programmers should use pipes wherever possible, thereby making
it easier to migrate the application to a fully-protected environment when
an MMU becomes available.
Memory management and I/O
Most RTOSs implement the standard C/Unix memory management model,
supporting fixed- and variable-sized memory block allocation from one or
more memory pools on a process by process basis. To maximize security
though, the RTOS should utilize the MMU to provide a firewall that affords
maximum protection for memory objects that are owned by the kernel and user
processes.
Integrity's memory management facilities, for example, let programmers
associate any memory object, including physical memory and I/O registers,
with user processes. The kernel, working in conjunction with the MMU,
enforces access permissions for those objects, thereby preventing processes
that lack the necessary permissions from deliberately or inadvertently
accessing those objects.
The ability to associate physical memory and I/O register objects with
user processes also lays the groundwork for a more secure approach to I/O
that is ideal for slow or adequately buffered devices that don't require
frequent interrupts. Typically, I/O is handled by complex, interrupt-driven
device drivers that run in kernel mode. The complexity of these drivers
makes them error prone, a problem that is exacerbated by the fact the
drivers run in kernel mode, which gives them access to critical kernel data
structures.
A simpler, more reliable approach is to substitute user processes for
device drivers, mapping the buffers and necessary kernel I/O pages into the
memory space of the user processes that are responsible for I/O. This
enables user processes to access CPU hardware resources and perform I/O
operations without having to execute in kernel mode and without having to
disable interrupts.
Keeping pace with the hardware
Maximizing real-time performance and memory efficiency is still a key
consideration for many real-time systems, especially deeply-embedded
systems equipped with underpowered CPUs and limited memory. Even so, rapid
growth in application complexity, coupled with significant gains in
hardware capability, are placing security and reliability issues on an even
footing with performance concerns.
Today's mid-range and high-end CPU already provides the hardware MMUs
needed to support advanced protection and security features. Before long,
hardware MMUs will also be available on the low-cost CPUs targeting deeply
embedded applications. RTOS vendors need to follow suit, adding protection
and security features that can take advantage of this hardware. Designers
using CPUs equipped with an MMU will be able to realize the benefits of
this approach immediately. Designers using CPUs without an MMU will be
ready with MMU-enabled applications that can be easily ported to more
capable platforms as they become available.
|