Porting Embedded Windows CE 6.0 R2 to the OMAP-L138, Part 1 - Embedded.com

In this three-part series, Artsiom Staliarou and Denis Mihaevich describe in detail how they ported the Windows CE 6.0 R2 embedded operating system to the Texas Instruments ARM-based OMAP-L138 family of processors.

Part 1: Evaluating the basics of Windows CE 6.0 R2 and OMAP-L138 processors

One of the most flexible implementations of the ARM architecture is Texas Instruments' OMAP-L138 SoC. With careful design it can serve not only as the core of mobile applications but also in a wide range of standalone non-mobile designs in industry, medicine, and machine automation.

This article describes our experiences developing designs based on the OMAP-L138 and the C6-Integra family in general. The embedded operating system we used was Microsoft Windows Embedded CE/Compact, a multicomponent real-time operating system (RTOS) that supports architectures such as ARM, MIPS, SH4, and x86. A developer with a board support package (BSP) for a specific platform can quickly create an operating system image by selecting the necessary components.

A wide range of tools is available for Microsoft Windows Embedded CE/Compact for debugging and profiling kernel code, user applications, and the OEM abstraction layer (OAL), along with an automated driver test system (CETK) and numerous helpful utilities for controlling and configuring the device hardware. Another strength of the OS is that code can be ported from applications written for desktop versions of Microsoft Windows, which significantly reduces the time and cost of end-device development.

The OMAP-L138 SoC
Texas Instruments' OMAP-L138 is a low-power, dual-core system-on-chip (SoC) that combines an ARM926EJ-S RISC MPU with a TI C674x VLIW DSP and includes a rich peripheral set. The DSP core supports floating-point operations, and the SoC has a Programmable Real-Time Unit subsystem (PRUSS) consisting of two 32-bit cores that offload the ARM core by performing preliminary data-flow processing. One of the PRU cores can also be turned into a CAN peripheral. The ARM926EJ-S core incorporates all of the internal modules necessary for designing an application based on the Microsoft Windows Embedded CE 6.0 RTOS. A block diagram of this SoC is shown in Figure 1.



Figure 1: OMAP-L138 SoC Block Diagram

A useful feature of this SoC in high-performance embedded applications is a multilevel central bus with an integrated enhanced direct memory access (EDMA) controller. This bus has internal links up to 64 bits wide that connect the peripherals with memory, making it possible to bypass the processor without using other bus links. This also makes arbitration problems easier to resolve.

With this bus, the EDMA controller can not only move data without involving the cores but can also perform general operations such as array sorting under hardware control. Another good addition to the rich peripheral mix is a video port and a SATA interface with a transmission rate of up to 3 Gbit/s.

These features make it possible for developers to build SoC-based designs that efficiently capture, process, and store media content. The SoC contains an McASP audio interface (multichannel audio serial port) that receives and transmits data to devices via 16 independent channels. This latter feature lets designers implement such devices and functions as DVD-players, audio interfaces, and audio processors.

Board Support Package
Board Support Package (BSP) development is one of the most labor-intensive processes when building embedded systems based on an OS. Stable operation and quality of the final product are mainly determined by the quality of the BSP, so the developer needs complete knowledge of the SoC as well as detailed understanding of the operating system.

MPC Data (www.mpc-data.co.uk/windows-embedded/) provides a basic BSP for this SoC. It includes drivers for almost all of the chip's peripheral units, the OEM abstraction layer (OAL), and the standard EBOOT loader. The OAL of this BSP supports:

  • boot parameters (passed from the loader);
  • initialization of the ARM core caches (instruction and data caches, write buffers for copy-back mode);
  • interrupt controller (full-function support with interrupts from primary inputs);
  • dedicated timer for the OS (with 1 ms steps), using 32-bit timer 0;
  • software emulation of a real-time clock;
  • memory management unit (MMU), building the table that translates physical addresses into virtual ones;
  • input/output system control;
  • kernel profiler (also using timer 0);
  • output of debugging messages via the UART2 port;
  • Kernel Independent Transport Layer (KITL) via the EMAC network peripheral (interrupt-driven or polled) for kernel debugging;
  • Virtual Network Miniport (VMINI) bridge for simultaneous operation of the kernel debugger and the network adapter in the OS;
  • SDK: routines for accessing the peripheral management, PLL, and general-purpose input/output (GPIO) controllers;
  • real-time clock (RTC) module;
  • watchdog timer; and
  • power management system with support for the OEMIdle() idle routine.
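
As a rough illustration of the 1 ms OS timer step in the list above, the sketch below computes how many counts a 32-bit timer would need per tick for a given input clock. The 24 MHz input clock and the function name are assumptions made for illustration; they are not taken from the BSP.

```python
def timer_period_counts(input_clock_hz: int, tick_hz: int = 1000) -> int:
    """Return the number of timer counts in one OS tick."""
    if input_clock_hz <= 0 or tick_hz <= 0:
        raise ValueError("clock and tick rates must be positive")
    counts, remainder = divmod(input_clock_hz, tick_hz)
    if remainder:
        raise ValueError("input clock is not an integer multiple of the tick rate")
    if counts > 0xFFFFFFFF:
        raise ValueError("period does not fit in a 32-bit timer")
    return counts


# Assumed 24 MHz timer input clock (hypothetical; check the board's clock tree).
period = timer_period_counts(24_000_000)
print(period)  # 24000 counts per 1 ms tick
```

Rejecting a clock that does not divide evenly is a design choice in this sketch: a fractional period would make the tick drift, so it is better to surface the problem than to round silently.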

The BSP contains tests for all of the basic peripheral modules. Testing of the file system modules, input/output ports, and I2C/SPI buses is supported via automated tests that use the open source KATO results-logging tool.

KITL and VMINI support makes it convenient and fast to debug new drivers and services on an already assembled unit, significantly reducing development time.



OS operation requires three essential elements: the bootloader, the OEM abstraction layer (OAL), and device drivers (Figure 2). A variety of configuration files (not shown) are also required.




Figure 2: BSP structure

Bootloader: The bootloader performs primary initialization of hardware resources. The OMAP-L138 can load images to its various cores from sources such as UART, NAND, NOR, I2C, SPI, and HPI, which allows the OS loader to be started by a variety of means. There are numerous methods of starting the SoC from NAND or NOR memory. For example, a small piece of code loaded into internal RAM can initialize the phase-locked loop (PLL) and double data rate (DDR) memory controllers and then copy the main loader code and the operating system into external RAM. Another variant is to launch code directly from NOR memory, which copies the OS code into RAM after initializing it.

In the OMAP-L138, TI has implemented primary initialization of the processor's operating frequency, memory settings, and other units via a specialized container (Figure 3) called the Application Image Script (AIS) file. Several images for various cores can be added to this container, each with its load area specified in advance. Thus, there is no need to develop startup code for PLL and DDR initialization in the operating system loader.



Figure 3: AIS-container


OEM abstraction layer (OAL): The OAL contains the startup code for the system (cache and memory management unit initialization), interrupt processing, timers, and power control. Providing an interface to the OS kernel, it implements a set of subroutines and I/O control codes (IOCTLs). The kernel, in turn, provides subroutines to the OAL that unify the interaction between the OAL and the kernel as far as possible.

Device drivers: The device drivers implement a unified interface. They abstract the hardware or virtual architecture and enable control of numerous peripherals such as the EDMA controller, I2C, SPI, LCD controller, video port, PWM controller, USB OTG, USB Host, and network controller.

Configuration files: In addition to a BSP, a multicomponent operating system is necessary to develop the device quickly. The extensive subroutines and basic software components that the OS provides for implementing numerous devices are important, as are tools for developing and debugging user applications, graphic interfaces, services, and drivers. Microsoft provides the Windows Embedded CE 6.0 RTOS, the Platform Builder development tools, Visual Studio, and Expression Blend.

Loading order
In the Microsoft Windows Embedded CE 6.0 RTOS, the loading order is defined in the system registry, so that device drivers are loaded during system startup according to clearly defined priorities and system requirements.
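
To make the registry-driven ordering concrete: in Windows Embedded CE 6.0, built-in driver entries live under HKEY_LOCAL_MACHINE\Drivers\BuiltIn, and each entry's Order value tells Device Manager what to load first. The sketch below only models that rule. The driver names and Order numbers are hypothetical, and placing entries with no Order value last reflects our reading of the documented behavior rather than anything specific to this BSP.

```python
# Hypothetical registry entries: (driver name, "Order" value or None if absent).
builtin_drivers = [
    ("SATA", 20),
    ("EDMA", 5),
    ("VideoPort", 30),
    ("Audio", None),
    ("I2C", 10),
]


def load_sequence(entries):
    """Sort entries by Order; entries without Order go last, ties keep input order."""
    def key(entry):
        order = entry[1]
        # False sorts before True, so entries that have an Order come first.
        return (order is None, order if order is not None else 0)

    # sorted() is stable, so equal keys stay in their original relative order.
    return [name for name, _ in sorted(entries, key=key)]


print(load_sequence(builtin_drivers))
# ['EDMA', 'I2C', 'SATA', 'VideoPort', 'Audio']
```

In this made-up layout the EDMA driver loads before the drivers that depend on it, which is the kind of dependency the Order value is meant to express.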

Multicore systems such as the OMAP have a significant advantage in offloading tasks such as complex calculations from the main core to other cores, allowing a designer to achieve high performance even in strictly deterministic real-time systems. But CE 6.0's default mode is operation in a single-processor environment, so it must be extended for use in multicore designs such as the OMAP-L138. To do this, Texas Instruments provides loading subroutines and other mechanisms for working with the DSP core (DSPLink), implemented by the BSP. This presupposes reserving sections of RAM in the operating system's memory layout. As shown in the example illustrated in Figure 4, this makes it simple to find the code and data for the DSP core in this area.
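
One practical consequence of reserving RAM sections for the DSP is that the regions must not collide with the memory the OS uses. The sketch below checks a memory layout for overlapping regions. All region names, addresses, and sizes are hypothetical values chosen for illustration; a real layout comes from the BSP's own memory configuration.

```python
from typing import NamedTuple


class Region(NamedTuple):
    name: str
    start: int  # physical start address
    size: int   # length in bytes

    @property
    def end(self) -> int:
        return self.start + self.size  # exclusive end address


def find_overlaps(regions):
    """Return pairs of region names whose address ranges intersect."""
    ordered = sorted(regions, key=lambda r: r.start)
    overlaps = []
    for i, a in enumerate(ordered):
        for b in ordered[i + 1:]:
            if b.start >= a.end:
                break  # every later region starts at or beyond the end of a
            overlaps.append((a.name, b.name))
    return overlaps


# Hypothetical split of 128 MB of external RAM (addresses are assumptions).
layout = [
    Region("OS image and RAM", 0xC0000000, 0x07000000),
    Region("DSP code and data", 0xC7000000, 0x00F00000),
    Region("DSPLink shared area", 0xC7F00000, 0x00100000),
]

print(find_overlaps(layout))  # [] when no regions collide
```

A check like this is cheap to run whenever the reserved areas are resized, since an overlap between the OS and DSP regions tends to show up only as hard-to-trace memory corruption at run time.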



Figure 4: Example of sequential loading of drivers in Windows CE

The example above illustrates sequential loading of drivers in a system for capturing and storing video data in 720p format at 30 frames per second. In this case, Windows Embedded CE 6.0 handles storage of the data on a hard disk through the SATA interface. The data may also need to be transmitted over the network. In such instances the data is compressed using video codecs (Motion JPEG, H.264, MPEG-4, etc.) to reduce the size of the stored and transmitted data. When all operations are assigned to the single ARM9 core, that processor is unlikely to cope with the load even at 450 MHz, so it is much better to shift video compression to the DSP core. (Codecs for such purposes are provided by TI.)
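
A quick back-of-the-envelope calculation shows why compression matters here. The sketch below estimates the uncompressed data rate of a 720p stream at 30 frames per second. The 2 bytes per pixel figure assumes a YUV 4:2:2 pixel format, which is our assumption; the article does not state the capture format.

```python
def raw_video_rate(width, height, fps, bytes_per_pixel):
    """Return the uncompressed stream rate in bytes per second."""
    return width * height * bytes_per_pixel * fps


rate = raw_video_rate(1280, 720, 30, 2)  # 2 bytes/pixel assumes YUV 4:2:2
print(rate)            # 55296000 bytes per second
print(rate * 8 / 1e6)  # 442.368 Mbit/s
```

Under that assumption the raw stream is roughly 55 MB/s, or about 442 Mbit/s, which is several times more than a 100 Mbit/s Ethernet link could carry and would fill a disk quickly.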



Figure 5: Windows CE 6.0 R2 by the AXONIM Devices Company, running with the modified BSP on the OMAP-L138 experimenter kit

The developer needs only to write an application (or service) for Windows Embedded CE 6.0 (Figure 5) that stores and transmits the already compressed data; the rest is implemented on the other cores in the SoC.

To read Part 2, go to “The OMAP Programmable Real Time Unit.”
To read Part 3, go to “Using the Windows 6.0 Board Support Package.”

Artsiom Staliarou and Denis Mihaevich are founders of the AXONIM Devices Company, a Microsoft Embedded Partner and independent embedded electronics system design center and system integrator with 25 engineers based in Minsk, Belarus. Skype: axonim.by.

Artsiom has a degree in radiophysics and has more than 10 years of experience in embedded system design based on ARM/Blackfin/TI DSP (C2x/C5x/C6x)/x86 devices and using Embedded Linux/Windows Embedded OSes.

Denis also has a degree in radiophysics and more than 12 years of experience in embedded system design and video analysis algorithm development, and has a certificate in optoelectronics.
