
Embedded C++ Slashes Code Size And Boosts Execution


New Specification Lets Developers Take Advantage Of The Object-Oriented Features Of C++, And Avoid Some Of Its Pitfalls.


Mike Haden, Green Hills Software, 30 W. Sola St., Santa Barbara, CA 93101; (805) 965-6044; fax (805) 965-6343; e-mail: mike@ghs.com or craig@ghs.com.

Embedded system programmers have by and large remained loyal to the C language, while their counterparts working on desktop applications have enjoyed the object-oriented features of the C++ language. However, the robust C++ language does have one major pitfall: Using it results in code that's simply too bloated for memory-sensitive embedded applications.

To address the shortcomings of C++ for embedded applications, an industry group led by major Japanese CPU manufacturers, including NEC, Hitachi, Fujitsu, and Toshiba, has set out to define a new dialect of C++ called EC++. The goal of the effort is to preserve the most useful object-oriented features of the C++ language yet minimize code size while maximizing execution efficiency.

The EC++ Technical Committee has already produced a draft specification for the new dialect targeted squarely at the 32-bit processors widely used in embedded applications. The committee is also working on a style guide that will help embedded systems programmers maximize the efficiency, readability, reliability, and maintainability of their EC++ programs. Development of validation suites for a broad range of processors, including the MIPS, PowerPC, 68k, and Coldfire families, Hitachi's SH family, and NEC's V800, is also underway.


EC++ — A Few Simple Reasons

Some C++ features, such as exception handling and multiple inheritance, add code-size and performance overhead even when they are not used in a program. Features such as namespaces, mutable specifiers, and new-style casts are difficult to understand, increasing the chances of programmer errors. Moreover, these features are overkill for many embedded applications. The use of templates often results in code bloat, and programmers struggle to accurately judge the amount of code that will be generated when using templates. C++ library functions, including standard I/O functions, result in large amounts of code being added to a compiled program, even though most of the code is never executed.

Before examining the features that have been omitted, programmers should consider the advantages of the object-oriented features that remain in EC++. Object classes have proven to be the most useful concept of C++. These classes allow programmers to partition code so that housekeeping functions like memory allocation and the initialization of data structures are separated from the main part of an application. The ability to separate these functions results in more readable code. The class definitions create objects that can be reused throughout a program.

For example, a programmer can use classes to describe objects ranging from arrays to geometric shapes to animals. A class called BOX can be defined with attributes such as height, width, depth, and even color or weight. When created, each instance of a BOX object automatically includes all class attributes or data members. The class definition includes the necessary object constructors, destructors, memory allocation code, and functions that are specific to the BOX class, such as a function that returns the volume of a box.
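As a rough sketch of the BOX example, the class below uses only features that EC++ retains. The member names, the integer units, and the use of assert to check constructor arguments are assumptions made for illustration; none of them comes from the EC++ draft specification.

```cpp
#include <cassert>

// Hypothetical Box class; names and units are illustrative assumptions.
class Box {
public:
    // The constructor initializes every data member when an instance is created.
    Box(int height, int width, int depth, int weight)
        : height_(height), width_(width), depth_(depth), weight_(weight) {
        assert(height > 0 && width > 0 && depth > 0 && weight >= 0);
    }

    // A function specific to the class: the volume in cubic units.
    // A traditional C cast is used because EC++ omits new-style casts.
    long volume() const { return (long)height_ * width_ * depth_; }

    int weight() const { return weight_; }

private:
    int height_;
    int width_;
    int depth_;
    int weight_;
};
```

With this definition, a declaration such as Box crate(2, 3, 4, 10) creates an object whose volume() is 24, and the housekeeping stays inside the class rather than in the main part of the application.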

Both EC++ and C++ also let programmers redefine standard operators (a technique called operator overloading) in such a way that they have meaning relative to a specific object type. For example, a programmer could redefine the "==" operator for use with two BOX objects. The programmer could specify that box A == box B when the two have equal dimensions. Or for a given application, the programmer may require that the boxes share the same weight or color before they can be judged to be equal. The programmer can harness this flexibility in the class definition and apply it to any valid EC++ operator, including "+", "-", "*", "/", ">", "<", and others. In the main code, a simple expression such as "BOX_A > BOX_B" is unambiguous and returns precisely the result desired by the programmer.
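Operator overloading for a box type might look like the sketch below. The rules chosen here (equality means identical dimensions, and "greater than" compares volume) are only one possible choice made for illustration, and the class is a hypothetical stand-in rather than code from the EC++ committee.

```cpp
#include <cassert>

// Hypothetical Box class; the equality and ordering rules are illustrative.
class Box {
public:
    Box(int height, int width, int depth)
        : height_(height), width_(width), depth_(depth) {
        assert(height > 0 && width > 0 && depth > 0);
    }

    long volume() const { return (long)height_ * width_ * depth_; }

    // Two boxes are equal when all three dimensions match.
    bool operator==(const Box& other) const {
        return height_ == other.height_ && width_ == other.width_ &&
               depth_ == other.depth_;
    }

    // One box is "greater" than another when it encloses more volume.
    bool operator>(const Box& other) const {
        return volume() > other.volume();
    }

private:
    int height_;
    int width_;
    int depth_;
};
```

An application that judged equality by weight or color instead would change only the body of operator==, and every comparison in the main code would pick up the new rule.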


C++ Omissions in EC++

EC++ retains much of the value found in C++, and a discussion of the omitted features also illustrates the valuable features that remain and the limits that EC++ imposes. The list of omitted features is short:
  1. Multiple inheritance and virtual base classes
  2. New-style casts
  3. Mutable specifiers
  4. Namespaces
  5. Run-time type identification
  6. Exceptions
  7. Templates

In reality, of course, classes are a significantly more complex concept than our BOX example indicates. Multiple inheritance and virtual base classes fall at the complex end of the robust class concept implemented in C++. Programmers often use a hierarchical structure to define complex classes. For example, a base class called SHAPE might define any geometrical shape and could have an attribute such as color. Classes such as CIRCLE or SQUARE could be derived from SHAPE with additional attributes including radius or width (Fig. 1). An object declared based on a derived class automatically includes all of the data members, functions, and other attributes defined in the base and derived classes.
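The SHAPE hierarchy could be sketched roughly as follows. The integer color code and the area() function are assumptions added for illustration, and the sketch relies on virtual functions, which do not appear in the list of omitted features.

```cpp
#include <cassert>

// Hypothetical hierarchy; the color encoding and area() are illustrative.
class Shape {
public:
    Shape(int color) : color_(color) {}
    virtual ~Shape() {}
    int color() const { return color_; }
    virtual double area() const = 0;  // each derived class supplies its own
private:
    int color_;
};

// Circle inherits color from Shape and adds a radius.
class Circle : public Shape {
public:
    Circle(int color, double radius) : Shape(color), radius_(radius) {
        assert(radius >= 0.0);
    }
    double area() const { return 3.14159265358979 * radius_ * radius_; }
private:
    double radius_;
};

// Square inherits color from Shape and adds a width.
class Square : public Shape {
public:
    Square(int color, double width) : Shape(color), width_(width) {
        assert(width >= 0.0);
    }
    double area() const { return width_ * width_; }
private:
    double width_;
};
```

A Square object declared this way answers both color(), inherited from the base class, and area(), defined in the derived class.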

EC++ and C++ both allow programmers to build multilevel class hierarchies in a linear fashion. C++ also allows multiple inheritance in which the programmer defines a new class based on two or more peer classes. For example, a programmer could define one class called BOX and a second class called CONTENTS. A new class called SHIPMENT could then be derived from BOX and CONTENTS. An object declared using the SHIPMENT class would inherit the data members, functions, and other attributes of both BOX and CONTENTS (Figs. 2 and 3).

Multiple inheritance can be valuable in a number of applications and is regularly used in graphical desktop environments such as Microsoft Windows. In Windows, for example, a useful object class can be derived based on a WINDOW class with display attributes such as size and borders; a MENU class with attributes such as menu names and styles; and a DISPLAY class with attributes that describe objects displayed in a window. Virtual base classes can be used along with multiple inheritance to share a base class that is inherited multiple times in a derivation hierarchy.
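A minimal sketch of a shared virtual base follows. It is standard C++ rather than EC++, and the class names (Window, Framed, Scrollable, TextView) are hypothetical; they are not taken from any Windows library.

```cpp
#include <cassert>

// Hypothetical classes; standard C++, not part of the EC++ subset.
class Window {
public:
    Window(int id) : id_(id) { assert(id >= 0); }
    int id() const { return id_; }
private:
    int id_;
};

// 'virtual' lets Framed and Scrollable share a single Window subobject.
class Framed : public virtual Window {
public:
    Framed(int id) : Window(id) {}
};

class Scrollable : public virtual Window {
public:
    Scrollable(int id) : Window(id) {}
};

// Without the virtual bases, TextView would hold two Window copies
// and a call to id() would be ambiguous.
class TextView : public Framed, public Scrollable {
public:
    // The most derived class initializes the virtual base directly.
    TextView(int id) : Window(id), Framed(id), Scrollable(id) {}
};
```

Because Window is inherited virtually, a TextView contains exactly one Window, so id() can be called on it without qualification.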

Some embedded applications could make use of multiple inheritance and virtual base classes, but these features are not nearly as useful a tool as in desktop applications. Supporting multiple inheritance in a compiler can carry a significant burden, and the technique is tricky to use correctly. Multiple inheritance can result in multiple base classes that include data members or functions with identical names, which the programmer must then identify correctly with the C++ scope-resolution operator. Moreover, real-world applications like embedded systems lend themselves far more often to a linear class hierarchy.
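The name clash described above can be sketched with the BOX, CONTENTS, and SHIPMENT example. This is standard C++ rather than EC++, and the weight() members are an assumption added to create the conflict.

```cpp
#include <cassert>

// Hypothetical classes; both bases happen to define weight().
class Box {
public:
    Box(int weight) : weight_(weight) { assert(weight >= 0); }
    int weight() const { return weight_; }
private:
    int weight_;
};

class Contents {
public:
    Contents(int weight) : weight_(weight) { assert(weight >= 0); }
    int weight() const { return weight_; }
private:
    int weight_;
};

// Multiple inheritance: Shipment derives from two peer classes.
class Shipment : public Box, public Contents {
public:
    Shipment(int boxWeight, int contentsWeight)
        : Box(boxWeight), Contents(contentsWeight) {}

    // An unqualified call to weight() would not compile here because it is
    // ambiguous; the scope-resolution operator selects the intended base.
    int totalWeight() const { return Box::weight() + Contents::weight(); }
};
```

Code that uses a Shipment faces the same ambiguity and must write, for example, shipment.Box::weight() to reach the weight of the box alone.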


Explicit-Type Conversion

While multiple inheritance may be difficult to use, other C++ features omitted from EC++ simply aren't used very often in any application. Despite such infrequent use, support for these features in C++ libraries results in bloated code, and it's usually better just to do without them. One good example is the dynamic_cast feature that was added as part of the new-style casts.

The C and C++ languages support the concept of "casting" to convert from one data type to another. For example, an integer must be converted to a floating-point number before it can be added to another floating-point number. Programmers also can use the casting concept to convert pointer types. In general, programmers should strive to minimize the need for casting because it's often indicative of a poorly structured program. EC++ and C++ class structures already reduce the need for casting relative to typical C programs.

In C, and therefore C++, since it's a superset of C, an expression such as (double)X will cast the value in X as a floating-point number, even if X had originally been defined as integer or short. C++ includes new, more explicit cast operators including 'static_cast,' 'reinterpret_cast,' 'const_cast,' and 'dynamic_cast.' With the exception of dynamic_cast, these new-style casts don't so much add functionality to traditional C casts as make the intentions of the program author more explicit. The features, however, haven't garnered much popularity in the programmer community. In fact, most popular C++ books don't even mention the new-style casts. Once books are revised to the current ANSI C++ level, they will probably discuss the new-style casts and possibly lead toward wider usage.
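The difference in explicitness can be seen in a short sketch. The function names are hypothetical, and legacyLength stands in for any older routine that omits const from a parameter it only reads. The new-style casts shown are standard C++ and would be rejected in strict EC++.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Traditional C cast: converts the int before the division.
double ratioTraditional(int numerator, int denominator) {
    assert(denominator != 0);
    return (double)numerator / denominator;
}

// static_cast: the same conversion, with the intent spelled out.
double ratioNewStyle(int numerator, int denominator) {
    assert(denominator != 0);
    return static_cast<double>(numerator) / denominator;
}

// Hypothetical legacy routine: reads the text but lacks const.
std::size_t legacyLength(char* text) { return std::strlen(text); }

// const_cast: removes only the const qualifier and nothing else. This is
// safe here solely because legacyLength never writes through the pointer.
std::size_t lengthOf(const char* text) {
    assert(text);
    return legacyLength(const_cast<char*>(text));
}
```

Both ratio functions compute the same value; the new-style version simply tells a reader, and the compiler, which kind of conversion was intended.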

Mutable specifiers also represent an arcane feature of C++ that is essentially a special case of explicit type conversion. Programmers can use the keyword 'mutable' to declare a data member of a class so that it can be modified even though the object that contains it is logically constant. Because new-style casts and mutable specifiers are rarely used, they are not justifiable for EC++, especially in light of the code overhead required to support dynamic_cast and the complexity of using mutable specifiers correctly.
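A small sketch may make the mutable specifier more concrete. The Sensor class and its read counter are hypothetical; the code is standard C++ and falls outside the EC++ subset.

```cpp
#include <cassert>

// Hypothetical class: the reading is the logical state, while the
// read counter is bookkeeping that may change even on a const object.
class Sensor {
public:
    Sensor(int reading) : reading_(reading), reads_(0) {}

    // A const member function may still update a mutable member.
    int value() const {
        ++reads_;
        return reading_;
    }

    int reads() const { return reads_; }

private:
    int reading_;
    mutable int reads_;
};

// Usage: a const reference is enough to read the sensor and count reads.
int readTwice(const Sensor& sensor) {
    int first = sensor.value();
    int second = sensor.value();
    assert(first == second);  // the logical state did not change
    return sensor.reads();
}
```

Without the mutable keyword on reads_, the increment inside value() would not compile, because value() is declared const.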


Namespaces, Run-Time-Type ID

Some other C++ features including namespaces and run-time-type identification (the mechanism used by dynamic_cast and other features) are primarily useful in extremely large applications. More specifically, these features are useful in projects where many programmers are working on a single code base. They also come in handy when a programmer must interface an application with multiple libraries and code modules from different sources.

Namespaces give the programmer a way to avoid name conflicts. When several programmers work together, or when a programmer uses libraries from a third party, there is a good chance that they will run into duplicate function, class, or variable names. The programmer can declare a namespace and essentially contain a code module within that namespace. Within the module, the programmer can access any variable or function normally. Code outside the namespace, however, must use the name of the namespace with the scope-resolution operator to refer to a variable or function within the namespace.
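As an illustration, two modules that both define a function named checksum can coexist when each is wrapped in its own namespace. The namespace names and the arithmetic are made up for the sketch, which is standard C++ and not valid in strict EC++.

```cpp
#include <cassert>

namespace vendor_a {
    int checksum(int value) { return value + 1; }

    // Inside the namespace, names are used without qualification.
    int doubleChecksum(int value) { return checksum(checksum(value)); }
}

namespace vendor_b {
    // Same name as vendor_a::checksum, yet there is no conflict.
    int checksum(int value) { return value * 2; }
}

// Code outside the namespaces must qualify each name.
int combinedChecksum(int value) {
    int a = vendor_a::checksum(value);
    int b = vendor_b::checksum(value);
    return a + b;
}

// Usage check: the two functions are distinct despite the shared name.
void namespaceUsage() {
    assert(vendor_a::checksum(3) == 4);
    assert(vendor_b::checksum(3) == 6);
}
```

In a small code base where one team controls every name, the same conflict can usually be avoided with a naming convention, which is the trade-off EC++ makes.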

In most embedded applications, the programmer or programmers will be dealing with a much smaller code universe, especially relative to desktop environments such as Windows. Experiment with Microsoft Visual C++ for Windows, and you'll discover thousands of arcane functions that have been developed over a decade and remain in libraries so that legacy applications will still work in the latest Windows release. Moreover, the typical Windows application is partitioned into dozens of header and source files. Namespaces can be critical to successful Windows programming.

Namespaces, however, are difficult to use correctly. Programmers can use namespaces inappropriately and simply add complexity to code unnecessarily. Embedded system programmers typically don't need namespaces, and therefore should not need to deal with the associated complexities.

Run-time-type identification, meanwhile, solves a different problem that programmers regularly encounter when using libraries and classes developed by third parties. In C++, a program can be passed an object of unknown type, and may need a way to identify that object's type. In a simple case, a code library could pass a pointer to an application and the type of the pointer could be associated either with a base class or a derived class. In the case of the SHAPE example used earlier, an application could receive a pointer that is declared as pointing to a SHAPE. In reality, however, because SHAPE is a base class, the object it points to could be of type SHAPE, SQUARE, or CIRCLE. C++ provides the keywords 'dynamic_cast' and 'typeid' that programmers can use for identifying the actual type at run time.
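A brief sketch of both keywords, using a cut-down version of the SHAPE hierarchy, is shown below. The helper functions radiusOf and isSquare are hypothetical, and the code is standard C++ that strict EC++ would not accept.

```cpp
#include <cassert>
#include <typeinfo>

// Run-time-type identification requires at least one virtual function.
class Shape {
public:
    virtual ~Shape() {}
};

class Circle : public Shape {
public:
    Circle(double r) : radius(r) {}
    double radius;
};

class Square : public Shape {
public:
    Square(double w) : width(w) {}
    double width;
};

// dynamic_cast yields a null pointer when the object is not a Circle.
// Returns the radius, or -1.0 for any other shape.
double radiusOf(const Shape* shape) {
    assert(shape);
    const Circle* circle = dynamic_cast<const Circle*>(shape);
    return circle ? circle->radius : -1.0;
}

// typeid reports the actual type of the object behind the reference.
bool isSquare(const Shape& shape) {
    return typeid(shape) == typeid(Square);
}
```

An embedded program that controls its whole class hierarchy can often replace such checks with a virtual function, avoiding the type tables that the compiler must otherwise emit.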

Much like namespaces, the run-time-type identification feature is far more useful in desktop environments than in embedded systems. Windows libraries, for example, pass pointers to display windows, menus, and content indiscriminately. Embedded system programmers typically have more control of the code base and will benefit more from the overhead eliminated by omitting the feature from EC++ than they would from using the feature.


Exception Handling

Unfortunately, not all C++ features omitted from EC++ can be dismissed quite so easily. Exception handling, for example, provides a robust mechanism through which a programmer can centralize and organize code to handle runtime errors or exceptions. Embedded systems need precisely these capabilities to handle conditions such as out-of-range input values in a data-acquisition system, or dangerously high air-pressure readings in an industrial-control scenario.

C++ exception handling allows the programmer to specify "try blocks" of code and, from anywhere within such a block or within code called from the block, throw control to an exception handler contained in a "catch block." C++ offers significant flexibility in how exceptions are handled and, in all cases, lets the programmer separate the exception-handling code from the mainline application. The programmer can dedicate a catch block to each try block. Alternatively, a single catch block can service an entire program.
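The mechanism can be sketched for the data-acquisition case mentioned earlier. Everything named here (OutOfRange, kMaxReading, and the three functions) is hypothetical, and the 0 to 1023 range assumes a 10-bit converter purely for illustration. The code is standard C++, not EC++.

```cpp
#include <cassert>

// Hypothetical exception type carrying the offending value.
class OutOfRange {
public:
    OutOfRange(int value) : value_(value) {}
    int value() const { return value_; }
private:
    int value_;
};

const int kMaxReading = 1023;  // assumed 10-bit converter range

// Throws when a raw sample falls outside the valid range.
int checkedSample(int raw) {
    if (raw < 0 || raw > kMaxReading) throw OutOfRange(raw);
    return raw;
}

// Mainline code: no error handling appears here at all.
long sumSamples(const int* raw, int count) {
    assert(count >= 0);
    long sum = 0;
    for (int i = 0; i < count; ++i) sum += checkedSample(raw[i]);
    return sum;
}

// The try block also covers code called from it; the catch block
// reports a bad block of samples by returning -1.
long processBlock(const int* raw, int count) {
    try {
        return sumSamples(raw, count);
    } catch (const OutOfRange&) {
        return -1;
    }
}
```

The throw in checkedSample sits two calls below the handler, yet control transfers directly to the catch block in processBlock.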

Unfortunately, exception handling is the leading offender when it comes to bloated code. Typical exception-handling libraries and user code bloat code even when the feature isn't used in a C++ application. In addition, programmers can't determine the latencies associated with C++ exception handling, and quick response is paramount to many real-time embedded applications. The feature was omitted in EC++ for the simple reason that, in most cases, embedded system programs can't afford the luxury of a general-purpose, high-level exception-handling scheme. Instead, embedded-system programmers must develop application-specific exception handlers that are hand-tuned and tightly coupled to the application.
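One common shape for such an application-specific scheme is an explicit status code returned by every routine that can fail. The sketch below is only an assumption about how a programmer might do this within the EC++ subset; the Status values, kMaxReading, and the function names are hypothetical.

```cpp
#include <cassert>

// Hypothetical status codes for a data-acquisition routine.
enum Status { STATUS_OK, STATUS_OUT_OF_RANGE };

const int kMaxReading = 1023;  // assumed 10-bit converter range

Status checkSample(int raw) {
    return (raw < 0 || raw > kMaxReading) ? STATUS_OUT_OF_RANGE : STATUS_OK;
}

// Writes the total to *sum only when every sample is valid.
// The error path is an ordinary return, so its cost is easy to bound.
Status sumSamples(const int* raw, int count, long* sum) {
    assert(sum && count >= 0);
    long total = 0;
    for (int i = 0; i < count; ++i) {
        if (checkSample(raw[i]) != STATUS_OK) return STATUS_OUT_OF_RANGE;
        total += raw[i];
    }
    *sum = total;
    return STATUS_OK;
}
```

The price is that every caller must test the returned status, which is exactly the clutter that C++ exceptions were designed to remove from mainline code.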


Templates

Template support also proves to be a useful C++ feature in embedded systems, yet was omitted from EC++. Again, the technical committee omitted templates because the feature is often used incorrectly and when it is, it artificially bloats the code. Correct usage, however, can allow programmers to simplify source code with little impact on code size or performance.

In the simplest case, templates provide flexibility in class definitions. For example, consider an array class that could be used to define an array object capable of storing samples in a data-acquisition system. Without templates, the class definition must specify whether the samples will be stored as integers, floating-point numbers, or another valid data type. Should the programmer need both an array for floating-point samples and an array for integer samples, then the programmer must write two separate class definitions. Templates, however, allow the programmer to write a single array class that can be used by an application to declare an array for any valid data type.
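The array class for samples might be sketched as a template like the one below. SampleArray, its fixed capacity parameter N, and its member names are assumptions made for illustration; templates are standard C++ and are not part of strict EC++.

```cpp
#include <cassert>

// Hypothetical fixed-capacity array of samples of any type T.
template <typename T, int N>
class SampleArray {
public:
    SampleArray() : count_(0) {}

    // Stores a sample; returns false when the array is already full.
    bool add(const T& sample) {
        if (count_ >= N) return false;
        data_[count_++] = sample;
        return true;
    }

    int count() const { return count_; }

    const T& at(int index) const {
        assert(index >= 0 && index < count_);
        return data_[index];
    }

private:
    T data_[N];
    int count_;
};
```

Declaring both SampleArray<int, 64> and SampleArray<double, 64> reuses the single source definition, but the compiler typically generates a separate copy of the member functions for each combination of T and N, which is where the code-size risk comes from.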

Despite the fact that EC++ omits a few useful features, embedded-system programmers should find the dialect quite useful. Green Hills developed a sample EC++ program to demonstrate the memory efficiency. The program solves a form of the classical "traveling salesman" problem that is often used to teach programming. When compiled on the Green Hills C++/EC++ compiler using the EC++ mode, the total code size is 57 kbytes. When compiled using the identical EC++ source file with a full C++ library but no exception-handling support, the code size is 322 kbytes.

This example illustrates some important points. Programmers can't realize the advantages of EC++ by simply using a standard C++ compiler and not using features such as exception handling. Moreover, a standard C++ compiler won't enforce EC++ limits on all members of a programming team. A single offending function call could significantly bloat the code.

Embedded system programmers, however, don't necessarily have to do without features such as templates to realize the advantages of EC++. Modern compilers can allow programmers to selectively enable needed features such as templates. One way to offer such capabilities is to establish a level of C++ support that falls between EC++ and C++. For example, the Green Hills C++/EC++ compiler offers an extended EC++ mode with support for namespaces, new-style casts (except for dynamic_cast), mutable specifiers, and templates. These features are often useful and don't bloat the code, yet they can be difficult for programmers to understand and use correctly. This approach, combined with careful implementation and usage of the added features, provides the best of both worlds. Programmers can consider such products as scalable C++. The programmer or programming manager can decide on the best mix of features for each application.

Details can be found on the EC++ World Wide Web home page at www.caravan.net/ec2plus.

Mike Haden is Compiler Engineering Manager at Green Hills Software. Prior to joining Green Hills Software, he worked with the Green Hills compilers for several years at Ridge Computers. He has worked on compilers and language-related tools since graduating from San Diego State University in 1979. He holds a B.A. in Computer Science from San Diego State University, San Diego, Calif.