By Richard A. Quinnell
Embedded Java can play a key role in next-generation cell
phones, smart cards, wireless devices, and gaming systems as
well as many other embedded applications. The key will be to
choose the right options for implementation.
Since its introduction four years ago, Sun's Java language and
run-time environment have suffered from excessive hype and,
from an embedded systems designer's viewpoint, inadequate
performance. Running a Java virtual machine (JVM) to interpret
Java bytecodes made the approach too big and too slow for most
embedded applications. Sun's new Java 2 Micro Edition (J2ME)
has changed the situation, however, by creating a version that
takes a big step toward solving the concerns of embedded
systems. Now, designers are taking a closer look.
Java has a number of advantages for embedded system
designers. As a language, Java allows object-oriented
programming without the dangers of C++. For instance, Java
allows class inheritance but not from multiple parents, so
there's no opportunity for confusion. Similarly, Java prevents
the ambiguity that C++ allows in the definition of operators.
Java does not let programmers redefine operators, whereas C++
allows an operator to carry a separate user-defined meaning for
each data type despite having the same name.
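The two language points above can be sketched in a few lines of Java. This is only an illustration: every class and interface name below is hypothetical. A class extends exactly one parent but may implement several interfaces, and arithmetic on a user-defined type goes through an ordinary named method because Java has no user-defined operator overloading.

```java
// Hypothetical types, for illustration only.
interface Resettable { void reset(); }
interface Describable { String describe(); }

class Sensor {
    protected int reading;
    Sensor(int reading) { this.reading = reading; }
    int read() { return reading; }
}

// Exactly one parent class (Sensor), but any number of interfaces.
class TemperatureSensor extends Sensor implements Resettable, Describable {
    TemperatureSensor(int reading) { super(reading); }
    public void reset() { reading = 0; }
    public String describe() { return "temp=" + reading; }
}

// No user-defined operators: adding two Vec2 values is a named method call.
final class Vec2 {
    final int x, y;
    Vec2(int x, int y) { this.x = x; this.y = y; }
    Vec2 plus(Vec2 other) { return new Vec2(x + other.x, y + other.y); }
}
```

A caller writes `a.plus(b)` rather than `a + b`, so the code that runs is always visible at the call site.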
The Java run-time environment also offers useful attributes.
It ensures that applications don't step on each other's toes or
crash the entire system by checking the code in the JVM before
execution. If the code attempts to alter the system's core
behaviors, it won't be run. Java provides memory management so
that the programmer doesn't have to free memory explicitly and
run the risk of memory leaks. Java automatically frees
unused memory through its garbage collection process. The
run-time environment even simplifies program distribution by
incorporating a core class library. Applications programmers
don't have to supply those functions; they will already be
available on the target.
In addition, Java is widely known and supported in the
computing industry. This implies a wealth of resources in the
forms of working applets and experienced programmers, both of
which can help boost system development. There is no need to
reinvent the wheel with each new application.
The trouble is, most embedded applications face two major
constraints that Java hasn't handled well: not enough time and
not enough room. The time constraint arises because, typically,
an embedded system must respond to external events within a
narrow time frame. If the system is not done handling one event
before the next one occurs, it fails in its mission.
The time constraint also implies a need for determinism.
Designers depend on software elements completing their
respective tasks within a known or bounded period. Tasks that
are inherently unbounded, such as wait loops, must be
suspendable while time-critical tasks are executing.
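As a rough sketch of the time constraint, the following Java helper checks whether each event handler finished within a fixed budget and counts the misses. The class name, its methods, and the use of `System.nanoTime()` (which assumes a desktop-class Java runtime) are assumptions made for illustration, not part of any embedded Java API.

```java
// Hypothetical helper: tracks whether event handlers finish within a time budget.
final class DeadlineMonitor {
    private final long budgetNanos;
    private int missed;

    DeadlineMonitor(long budgetNanos) { this.budgetNanos = budgetNanos; }

    // Record one handler run from its start and end timestamps.
    // Returns true if the handler met its deadline.
    boolean record(long startNanos, long endNanos) {
        boolean ok = (endNanos - startNanos) <= budgetNanos;
        if (!ok) {
            missed++;
        }
        return ok;
    }

    // Time a handler with the system clock and record the result.
    boolean run(Runnable handler) {
        long start = System.nanoTime();
        handler.run();
        return record(start, System.nanoTime());
    }

    int missedCount() { return missed; }
}
```

A monitor like this only detects missed deadlines after the fact. Guaranteeing them requires bounded execution times in everything the handler calls.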
The space constraint comes from cost and portability
requirements. Designers need to use as little memory as
possible, often restricting the design to a microcontroller's
on-chip memory resources. This also helps reduce power
consumption, an important consideration in battery-powered,
portable systems.
Java has had problems working within these time and space
constraints. The Java software environment works with an
operating system and uses a Java Virtual Machine (JVM) to
translate Java bytecodes into the native language of the
system's processor (Figure 1). It has also required sizeable
class libraries as part of the core system. Both factors add
considerably to the system's memory requirements.
Figure 1: Java application software uses processor-independent
bytecode that executes through a Java Virtual Machine (JVM)
running on an arbitrary processor. The JVM combines with
class libraries and run-time code to form the complete Java
platform.
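To make the interpretation step concrete, here is a toy stack-machine interpreter in Java. It is a sketch only: the opcode names and values are invented for this example and are not real JVM bytecodes, and a real JVM does far more (class loading, verification, objects, exceptions). What it does show is the fetch, decode, and dispatch work that an interpreter pays on every instruction.

```java
// Toy stack-machine interpreter. Opcodes are hypothetical, not real JVM bytecodes.
final class ToyInterpreter {
    static final int PUSH = 0;  // PUSH n: push the following word onto the stack
    static final int ADD  = 1;  // pop two values, push their sum
    static final int MUL  = 2;  // pop two values, push their product
    static final int HALT = 3;  // stop and return the top of the stack

    static int run(int[] code) {
        int[] stack = new int[16];
        int sp = 0;  // index of the next free stack slot
        int pc = 0;  // index of the next instruction word
        while (true) {
            int op = code[pc++];  // fetch
            switch (op) {         // decode and dispatch, repeated for every instruction
                case PUSH:
                    stack[sp++] = code[pc++];
                    break;
                case ADD: {
                    int b = stack[--sp];
                    int a = stack[--sp];
                    stack[sp++] = a + b;
                    break;
                }
                case MUL: {
                    int b = stack[--sp];
                    int a = stack[--sp];
                    stack[sp++] = a * b;
                    break;
                }
                case HALT:
                    return stack[sp - 1];
                default:
                    throw new IllegalStateException("bad opcode " + op);
            }
        }
    }
}
```

For example, the program `{PUSH, 2, PUSH, 3, ADD, PUSH, 4, MUL, HALT}` computes (2 + 3) * 4.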
Java's interpreted code is inherently slower than compiled
code, making it more difficult for systems to meet their
real-time constraints. Faster processors could help, but power
considerations often prevent embedded systems from using them.
Even if the system is fast enough, Java's garbage collection
algorithms are both unbounded and uninterruptible, making
determinism an impossible goal.
The Java 2 Micro Edition (J2ME) addresses some of these concerns by
reducing the size of class libraries and altering the garbage
collection algorithm. It has brought Java within reach of many
embedded designs by defining two categories of Java: the
Connected Device Configuration (CDC) and the Connected Limited
Device Configuration (CLDC). These replace the older Embedded
Java, which was essentially a non-standard version of Java for
custom applications.
| J2ME Category | CDC | CLDC |
|---|---|---|
| Virtual Machine | CVM | KVM |
| Processor | 32-bit | 16- or 32-bit |
| Memory Requirements | 2 - 16 Mbytes | 160 - 512 Kbytes |
| Restrictions | GUI dependencies on java.awt removed | Limited error handling; no user-defined class loader; no support for floating-point data types, finalization of class instances, the Java Native Interface, or thread groups |
| Target applications | TV set-top boxes, Internet TVs, Internet-enabled screen-phones, high-end communicators, automobile entertainment and navigation systems, and other devices with persistent and high-bandwidth network connections | Cell phones, two-way pagers, personal organizers, home appliances, and other limited-resource and network-connected devices |
Table 1: The CDC and CLDC are both standard
configurations for industry-wide use.
The CDC is a full-function version of Java aimed at devices
with network connections, a 32-bit processor, and 2 Mbytes of
memory available for the Java platform. This version of Java
will allow devices to download and run general-purpose applets
in a manner similar to desktop machines. Set-top boxes,
Internet-enabled screen phones, and car navigation systems are
examples of target applications.
The CLDC is a reduced version of Java for applications with
a more tailored runtime environment. Rather than allowing
general-purpose applets, the CLDC requires that the Java
programs conform to the device's constraints. This loses the
'write-once, run anywhere' promise of Java, but still retains
the other benefits of Java programming. The CLDC and its K
virtual machine need as little as 160 Kbytes of memory and a
16-MHz, 16-bit processor.
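One practical consequence of the CLDC restrictions in Table 1 is that fractional arithmetic cannot use floating-point types. A common workaround is fixed-point arithmetic on integers. The sketch below assumes a 16.16 format (16 integer bits, 16 fraction bits) and that the 64-bit `long` type is available for intermediate results. The class and method names are hypothetical and not part of any J2ME library.

```java
// Hypothetical 16.16 fixed-point helper: values are ints scaled by 2^16.
final class Fixed16 {
    static final int FRACTION_BITS = 16;
    static final int ONE = 1 << FRACTION_BITS;  // fixed-point representation of 1.0

    static int fromInt(int n) { return n << FRACTION_BITS; }

    // Drops the fraction bits (rounds toward negative infinity).
    static int toInt(int f) { return f >> FRACTION_BITS; }

    // Build a value from a ratio; ratio(3, 2) represents 1.5.
    static int ratio(int num, int den) {
        return (int) (((long) num << FRACTION_BITS) / den);
    }

    // Multiply in 64 bits, then rescale back to 16.16.
    static int mul(int a, int b) {
        return (int) (((long) a * b) >> FRACTION_BITS);
    }

    // Pre-scale the dividend in 64 bits so the quotient stays in 16.16.
    static int div(int a, int b) {
        return (int) (((long) a << FRACTION_BITS) / b);
    }
}
```

With this format, `Fixed16.mul(Fixed16.ratio(3, 2), Fixed16.fromInt(4))` equals `Fixed16.fromInt(6)`. The trade-off is limited range and precision, which the programmer must track by hand.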
With these two configurations, Sun has evolved standard Java
configurations that fit the space constraints of many embedded
system designs. The question of real-time and determinism has
recently been addressed by the creation of a real-time Java
specification within the Java community. Released in November
2001, the Real-Time Specification for Java (RTSJ V1.0) provides
standard extensions to the Java platform and alterations to the
garbage collection algorithm to give Java the determinism that
many embedded applications need.
That leaves only the issue of raw performance to be
resolved, and it looks like it has been. Answers have come from
the industry as an array of approaches to boost Java execution
speed. The approaches include using optimized JVMs, compiling
Java to native code before execution, using just-in-time (JIT)
compilers, and using hardware acceleration. Each approach has
its benefits and drawbacks.
Optimized JVMs typically speed execution 2x to 2.5x relative
to their generic cousins. Such optimizations are processor
specific, however. While many processor vendors offer optimized
JVMs, not all do. Those that do may also offer optimized class
libraries and real-time operating systems that work closely
with the JVM to further increase software performance.
Optimized or not, using a JVM still involves interpretation,
which restricts program execution speed. Compiling the Java
code to native code before execution avoids that restriction.
In such cases, Java becomes just another high-level language,
like C++, and the limits to execution speed are set only by the
compiler's code efficiency. The trouble is, this compilation
must be performed as with other high-level languages: before
placement into program memory. The result is an inflexible
system, unable to download Java code upgrades or new
applications.
The just-in-time compiler seeks to regain that flexibility
by compiling Java code "on the fly" for immediate execution.
This yields performance and flexibility, but adds to the launch
time of a specific application because compilation must run
first. Using a JIT also increases the
system memory requirements by occupying at least 100 Kbytes in
addition to the JVM and applications.
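The trade-off between interpreting and translating on first use can be illustrated with a toy analogy in Java. This is not how a real JIT works internally: a real JIT emits native machine code, whereas this sketch merely decodes a hypothetical two-opcode program once and caches the result as a chain of Java functions. All names and opcodes are invented for the example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.IntUnaryOperator;

// Toy analogy for just-in-time translation. Opcodes are hypothetical.
final class ToyJit {
    static final int ADD = 0;  // ADD k: x becomes x + k
    static final int MUL = 1;  // MUL k: x becomes x * k

    private final Map<String, IntUnaryOperator> cache = new HashMap<>();
    private int translations;

    // Interpreter: decodes every instruction on every call.
    static int interpret(int[] code, int x) {
        for (int pc = 0; pc < code.length; pc += 2) {
            int k = code[pc + 1];
            switch (code[pc]) {
                case ADD: x += k; break;
                case MUL: x *= k; break;
                default: throw new IllegalArgumentException("bad opcode");
            }
        }
        return x;
    }

    // Translator: decodes once and builds a reusable chain of functions.
    private IntUnaryOperator translate(int[] code) {
        translations++;
        IntUnaryOperator f = IntUnaryOperator.identity();
        for (int pc = 0; pc < code.length; pc += 2) {
            final int k = code[pc + 1];
            switch (code[pc]) {
                case ADD: f = f.andThen(v -> v + k); break;
                case MUL: f = f.andThen(v -> v * k); break;
                default: throw new IllegalArgumentException("bad opcode");
            }
        }
        return f;
    }

    // The first call for a given name pays the translation cost; later calls reuse it.
    int run(String name, int[] code, int x) {
        IntUnaryOperator compiled = cache.get(name);
        if (compiled == null) {
            compiled = translate(code);
            cache.put(name, compiled);
        }
        return compiled.applyAsInt(x);
    }

    int translationCount() { return translations; }
}
```

The cache is the memory cost and the first-call translation is the launch-time cost described above. Both are paid in exchange for skipping the decode step on later calls.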
To speed Java execution without the disadvantages of either
compilation or the software JVM, embedded developers can turn
to hardware accelerators. These devices offload some or most of
the JVM's task to dedicated hardware, resulting in a 5x to 10x
improvement over interpreted Java. Hardware accelerators don't
take over the entire task, however. The host CPU will still
handle particularly complex or seldom-used bytecodes.
Semiconductor vendors have taken several approaches to
accelerating Java in hardware, choosing to focus on different
tasks. One approach is the hardware interpreter. The
interpreter takes incoming Java code and transforms much of it
into the CPU's native code, saving the JVM the trouble (see
Figure 2). Examples include Nazomi's Jstar, InSilicon's JVX, and
ARM's Jazelle (for more details on the ARM Jazelle, see the
TechOnLine Webcast on ARM Jazelle Technology). Most recently,
ARC Cores has added a Java
extension core from Digital Communication Technology (DCT) to
its base processor. In most cases, the interpreters are silicon
IP that, in effect, extend a processor's instruction set to
include Java bytecodes.
Figure 2: A Java coprocessor, such as the Nazomi Jstar, boosts JVM performance by handling the time-consuming translation of bytecode into the native instructions of the host CPU.
Coprocessors not only interpret the bytecodes, they execute
the resulting machine code, offloading the CPU completely. They
are, in effect, processors that use Java bytecode as their
native machine language. Some, like InSilicon's JVXtreme, are
pure coprocessors. Others, such as Aurora VLSI's Espresso and
DeCaf, can act as coprocessors or as stand-alone processors
that handle the Java code while another CPU handles things such
as the user interface. AJile's aJ-100, DCT's Lightfoot, and
Zucotto's Xpresso are also coprocessors. As with the
interpreters, these coprocessors are often available as cores
for ASIC or FPGA implementation.
A third form of Java acceleration, the hardware JIT compiler,
compiles Java bytecode on the fly. Such compilers differ from
hardware interpreters in that they don't merely translate the
software from one form to another. They literally compile,
including making optimizations and restructuring code execution
order to boost performance. MachStream from Parthus falls into
the JIT category.
This array of hardware and software alternatives for
speeding Java code execution seems like it should solve the
problem of Java performance in embedded systems. Unfortunately,
it is difficult to predict how much of a performance boost they
will offer. That difficulty is compounded by the interaction of
the accelerators with other systems elements. The CPU
architecture, the amount of system memory available, the RTOS,
the JVM, the class libraries, and the hardware acceleration all
affect the system's final performance. Application software also has
an impact. A system hardware and software configuration that
works well for an Internet appliance, for instance, may be
slower in a set-top box and totally unsuitable for a cell
phone.
Unfortunately, embedded designers have few tools to help
them evaluate the performance of alternative configurations.
The most useful tool is the SPEC JVM98 benchmark, developed by
the Standard Performance Evaluation Corporation. The benchmark
measures the efficiency of the JVM,
JIT compilers, and operating-system implementations. It also
provides platform-specific measurements, including the
performance of the CPU, cache, memory, and coprocessor
configuration.
But SPEC JVM98 is not geared toward the needs of embedded
systems. It was developed for networked and stand-alone client
computers, and assumes that there is a full implementation of
Java with a complete desktop system environment. To run the
benchmark, for example, the target system needs at least 32
Mbytes of memory and a full graphics display in order to view
the results. Few embedded systems are that resource rich.
The CaffeineMark from Pendragon Software is a benchmark
that is popular in the embedded space. Like the Dhrystone MIPS
benchmark, CaffeineMark is an artificial benchmark that
measures only a few Java features. It excludes such things as
floating-point operations, garbage collection, and multiple
threads, any of which may be important to embedded developers.
Further, there is no standard configuration under which the
benchmark runs. As a result, benchmark results from vendors are
difficult to interpret.
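To show why such results are hard to compare, here is a sketch of the kind of small synthetic kernel an artificial benchmark might time. It is not taken from CaffeineMark or any published suite: the class name, the choice of a prime sieve, and the scoring formula are all assumptions. It exercises integer loops and array access only, so it says nothing about garbage collection, threads, or I/O.

```java
// Hypothetical synthetic benchmark kernel; not from any published suite.
final class SieveBench {
    // Count the primes below limit with a sieve of Eratosthenes (integer-only work).
    static int countPrimes(int limit) {
        if (limit <= 2) {
            return 0;
        }
        boolean[] composite = new boolean[limit];
        int count = 0;
        for (int i = 2; i < limit; i++) {
            if (!composite[i]) {
                count++;
                // Start at i*i; use long to avoid int overflow for large limits.
                for (long j = (long) i * i; j < limit; j += i) {
                    composite[(int) j] = true;
                }
            }
        }
        return count;
    }

    // Run the kernel repeatedly for at least minMillis and report whole runs per second.
    static long runsPerSecond(int limit, long minMillis) {
        long start = System.currentTimeMillis();
        long runs = 0;
        long elapsed;
        do {
            countPrimes(limit);
            runs++;
            elapsed = System.currentTimeMillis() - start;
        } while (elapsed < minMillis);
        return runs * 1000 / Math.max(elapsed, 1);
    }
}
```

The score depends on the chosen `limit`, the run time, the clock resolution, and whether any warm-up has occurred, which is one reason results measured without a standard configuration are hard to compare.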
The lack of evaluation tools may not be a problem much
longer for embedded Java. The EDN Embedded Microprocessor Benchmark Consortium
(EEMBC) has begun development of a Java benchmark suite. The
EEMBC benchmarks propose to make a number of system
measurements, including such things as garbage collection time
and determinism, I/O performance, interrupt latency, memory
usage, and system power consumption during the benchmark.
Detailed software execution benchmarks will also be included,
measuring such things as class loader time, class method
execution, number of threads used, time spent in each thread,
and the time to invoke threads.
The consortium plans to have the benchmarks run under a
variety of application environments, including smart cards,
cell phones, palm devices, Internet appliances, and set-top
boxes. Not all benchmarks will run in each environment,
however, because tests appropriate for one application may be
meaningless in another. When run in an unbiased standard system
configuration that the consortium will define, the benchmarks
will allow independent evaluation of CPUs, JVMs, JITs, RTOSs,
and hardware accelerators. All benchmark results will be
certifiable through the EEMBC Certification Labs.
Because application software also has an impact on Java
performance, the consortium will structure the benchmarks to
reflect the needs of several common application types.
Browsers, games, notepad editing, and graphics-intensive
applets will form part of the benchmark mix.
When these tools are ready, they will provide a means of
making comparisons in light of the intended application.
Developers will have a much easier time choosing among embedded
Java's many alternatives and ensuring that the final system
performance will meet the target. Then, Java can take a lead
role in future embedded system development.