Migrating from 8-/16-bit to 32 bit: Lessons Learned the Hard Way - Embedded.com


Processors are becoming more powerful, both in terms of MIPS and in the bandwidth of the data they can handle. They are equipped with most of the peripherals needed to make them resemble a system on a chip (SoC). As the complexity rises it becomes difficult to comprehend, and it is necessary to move to higher abstraction levels.

Developers become more dependent on tools, including an IDE, compilers, JTAG debuggers and all that "once fancy" stuff. The level of abstraction also has to be increased, since getting a grip on the underlying processor architecture is difficult, or at least takes a longer and steeper learning curve.

Though it is attractive to jump into a 32-bit architecture from an existing 8-/16-bit design capability because of the amazing price-performance figures, there are hidden pitfalls you may encounter.

This article is excerpted from a paper of the same name presented at the Embedded Systems Conference Boston 2006. Used with permission of the Embedded Systems Conference. For more information, please visit www.embedded.com/esc/boston/

The transition from 8-bit to 16-bit was never clear-cut, since the 16-bit architectures never superseded the 8-bit segment. Among other things, 16-bit CPUs are often downsized to handle 8-bit peripherals, reducing the data throughput because of the resulting I/O bottlenecks. There are also 16-bit digital signal processors, which have become more popular than their 16-bit MPU counterparts because of the need to handle wider data in signal processing computations.

Now we have suddenly come into an era where 32-bit processors are cheaper than their 8-bit counterparts. This market push, along with ever-increasing feature demands, has presented the developer with difficult 32-bit processor system design choices, in consumer applications that call for additional features and connectivity as well as in industrial applications.

There are a few cores that have gained wide use and licensing across different vendors. Indeed, the ARM7 and ARM9 cores have been considered the "8051 core" of this decade, based on their popularity and availability from different vendors.

There are some specific advantages in choosing such widely available architectures. A developer can switch across different vendors for any upgrade or additional features. There is usually a roadmap associated with the hardware from a specific vendor, but also a migration path across different vendors.

Another advantage is the skill set available for working with these popular cores. This helps reduce the learning curve and initial startup delays. These skills can also be applied across different applications, and hence the initial setup costs for the tools can be amortized across different products under development.

Architecture
In the transition to these more powerful CPUs, the first thing you need to look into is your familiarity with, or basic knowledge of, the chosen 32-bit architecture. It is a misconception to assume that if you equip yourself with all the necessary tools you can move forward without a basic knowledge of the architecture.

It is easy to get carried away by statistics displayed on websites showing that a processor family occupies a substantial share of the market. An architecture that is widely used in consumer electronics may not be a very good choice for the automotive sector, because requirements differ with the application. And when you delve into specific categories of industrial applications, you might even be the first-time user of a specific peripheral or functionality.

The functional block diagram of the processor under consideration needs to be closely examined and understood. There are new terminologies associated with a 32-bit processor compared to a 16- or an 8-bit processor. Hence you need to understand the abstract functionality, the constraints and the additional features. There are a host of peripherals that have not been part of the 8-/16-bit design scheme. Security is also an important issue, even in portable and embedded devices. Media-related extensions, Internet connectivity and power management have also become important considerations.

When you choose a processor architecture you should make a feature comparison against the competing architectures and arrive at a decision based on your application requirements. Both the functional and non-functional requirements have to be addressed, and the supplementary specifications also need to be considered.

It is easy to be misled by the MIPS value or by industry-standard benchmarks for some specific algorithms. Some cores are optimized for specific algorithms based on the hardware, the pipeline or the cache mechanisms. So it might come as a surprise, after implementation, that the processor cannot cope with the execution times the application demands. By then it is too late, and disastrous if you need to change the processor, even if there is an upgrade path. So it is essential to identify the key functionalities the application will need to handle and map the processor and its peripheral features against them for compliance.

ARM has evolved into a variety of cores, currently ranging from ARMv4 at the lower end to ARMv7. The 32-bit instruction set architecture operating in the 32-bit space has been central to all the cores. Additionally, the 16-bit Thumb instruction set was introduced to optimize code generation, since 32-bit instructions are not consistently required. The correct mix of 32- and 16-bit instructions for code optimization and execution speed has been the key to success in Thumb implementations. To cater to signal processing needs, the relevant signal processing instructions have been added. Further versions support Java and multiprocessing.

All these upgrades are driven by industry needs and requirements. Because of the convergence of applications, the mobile device is evolving much as the personal computer did in previous decades. The silicon needs to support the new challenges ahead, with portability and power consumption still the driving considerations. Different variants of the ARM cores have emerged targeting specific segments. ARM cores are also widely used as one core in dual-core devices such as the OMAP from Texas Instruments. Dual cores can offer the twin benefits of the individual cores themselves without compromises.

Development Boards
Once you have shortlisted candidate architectures for your application, you need to look into the development boards the vendors are offering. You have to be very careful about the features of the development board and the deliverables associated with the purchase of the board. You could easily end up with the board alone, a few example programs and a monitor, and nothing else.

These might cost a few thousand dollars, and on receipt you may be surprised to find that you need to buy the operating system, the compiler and the debuggers separately. This can involve another substantial investment, without which you will not even be able to start development.

Development boards often have a few technical issues that are not anticipated, either because of a lack of information or because of the rush to market to cash in on the early-bird incentive. These issues could be software or hardware related. Some peripherals might not work in specific modes, or a power supply regulator might not be heat-sinked properly.

If the development board is made by a third party, you may need to shuttle between the vendor and the third party, finally receiving a mail stating that this might be the issue and that the fix is being worked into the next hardware version of the board. You will be lucky if the issues are only software related, since you might then get a patch rectifying the issue.

Any timeline predicted for a ready-to-use development board normally runs to twice what was anticipated. If the architecture or family is very mature this overrun could be lower, but there will still be surprises every now and then. The design team will have planned to execute and test most of the code on the development board as such. In many situations, you need to build additional hardware or adapt your application to what is available on the development board. These activities are the main contributors to time and resource overruns.

The other benefit of buying a development board is the board support package (BSP), which encompasses ready-to-use code and binaries cross-compiled for the specific platform. This definitely gives you an edge over starting everything from scratch. Embedded Linux has been possible because the extra flesh can be stripped away so that it adapts to a smaller footprint.

Some device drivers are found as part of the board support package, and many clones of them are available from the Linux development community. The generic drivers are readily available and do not have serious implementation issues. It is also possible that in some instances device drivers are available for Windows but not for Linux.

Touch Screen Controller Drivers
Considerable effort needs to be expended to develop such functions. Development and testing of a touch screen controller driver, including the necessary calibration routines, might consume as much as a man-month.

The other time-consuming issue is serial protocols. Linux provides very good support for many communication protocols, but this applies to standard protocols; proprietary protocols may be difficult to implement. You will also come across proprietary protocols with a 9-bit addressing feature on multi-drop RS-485 connections.

If you need to support this, you need to tinker with the kernel and the device driver, provided the processor core supports this mode. Otherwise you might have to design a wrapper with an interim repeater that can seamlessly convert 9-bit into 8-bit if the other communication partner is a legacy system. Similarly, for half-duplex RS-485, the direction pin control has to be specifically addressed, since most serial drivers support only full-duplex RS-232 or RS-422.
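As a rough sketch of that direction-pin handling, the snippet below models the transceiver's DE/~RE line as a flag the caller passes in. Everything here is an illustrative assumption: on real hardware the flag write would be a GPIO register access, the byte copy would be the UART transmit path, and a busy-wait on the transmit-complete flag would precede releasing the line.

```c
#include <stdint.h>
#include <stddef.h>

/* Half-duplex RS-485 transmit sketch.  The direction pin and the "wire"
 * are stand-ins for a GPIO register and the UART TX path; all names are
 * illustrative, not a real driver API. */
size_t rs485_send(int *dir_pin, uint8_t *wire, const uint8_t *buf, size_t len)
{
    *dir_pin = 1;                 /* assert DE: driver enabled, transmitting */
    for (size_t i = 0; i < len; i++)
        wire[i] = buf[i];         /* stand-in for writing the UART TX register */
    /* on hardware: busy-wait here until the last stop bit has shifted out */
    *dir_pin = 0;                 /* release DE: back to receive */
    return len;
}
```

The key point the sketch captures is that DE must stay asserted until the final stop bit has actually left the shift register, not merely until the last byte has been queued.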

Handling parity bits, multiple checksums and serial timeouts can also be time consuming. Some of these issues might look simple and straightforward but can take anywhere between 10 and 15 developer-days to implement and test.
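The framing bookkeeping can be sketched as below. The frame layout, the 9th-bit-in-bit-8 convention and the additive checksum are all illustrative assumptions, not a published protocol; a real driver would carry the 9th bit via mark/space parity on the wire.

```c
#include <stdint.h>
#include <stddef.h>

#define NINTH_BIT 0x100u   /* set on address bytes only (multi-drop addressing) */

/* Build a 9-bit multi-drop frame: [addr | 9th bit][payload...][8-bit
 * additive checksum].  Each uint16_t holds one wire symbol, with the
 * 9th bit in bit 8.  Returns the number of symbols written to out. */
size_t build_frame(uint8_t addr, const uint8_t *payload, size_t len,
                   uint16_t *out)
{
    uint8_t sum = addr;
    out[0] = (uint16_t)addr | NINTH_BIT;  /* address byte: 9th bit set */
    for (size_t i = 0; i < len; i++) {
        out[1 + i] = payload[i];          /* data bytes: 9th bit clear */
        sum = (uint8_t)(sum + payload[i]);
    }
    out[1 + len] = sum;                   /* checksum travels as a data byte */
    return len + 2;
}
```

Even this toy version shows why the work adds up: the addressing, parity and checksum rules all have to agree with a legacy partner bit-for-bit.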

Reference Designs
When you move on to your own design after hands-on trials with the development board, you will often need to rely on reference designs, either for the CPU from the vendor or supplied with the development board. Often the reference design is just the schematic of the development board as such. In that case you can be confident that you have a working model and can proceed with the reference design without much change. Of course, you need to be in a position to tailor the design to your requirements.

In some cases you will not have tangible reference designs; they may come from an open community or from working groups. In these cases you need to check the schematic as a whole to ensure that it will work in the mode for which it was designed. You need to look into the datasheets of the peripherals, and the voltage and timing specifications in particular.

Most high-speed cores work at lower voltages while the external world still works at higher voltages, so the conversion and interpretation of these signals becomes critical. Most reference designs do not consider real-world implementation issues. Reference designs have to be taken just as references, not as absolutely usable designs.

They provide a quick start, but you need to ensure that everything has been addressed for your application requirements. Sometimes a part in the reference design is obsolete or phased out. In that case you need to work out an equivalent, either from the same vendor or from a different one, and update the schematic. Care should also be taken with additional buffering and sometimes with the glue logic. The reset modes in a reference design could also differ from what can be implemented in the final design.

For some peripheral connectivity you need to establish the end requirement before deciding to retain the peripheral schematic. To illustrate: in applications requiring TFT LCDs you need to look into the controller IC, with or without EDO RAM. Before you decide on the peripheral you need to confirm the diagonal size of the LCD and also the pixel density. If these vary, or there is a possibility of upgrading the features at a later date, you need to look for hardware compatibility that can operate across the choices.
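A quick sizing check along those lines: the controller's display RAM must hold the full frame at the chosen depth, so a change in resolution or color depth changes the memory requirement directly. The helper below is a trivial sketch of that arithmetic.

```c
#include <stdint.h>

/* Bytes of display RAM needed for one frame at the given resolution and
 * color depth.  A real budget would also add any double-buffering the
 * controller uses. */
uint32_t fb_bytes(uint32_t width, uint32_t height, uint32_t bits_per_pixel)
{
    return width * height * bits_per_pixel / 8;
}
```

For example, moving from a QVGA panel to a VGA panel at 16bpp quadruples the requirement, from 150KB to 600KB, which is exactly the kind of jump that decides whether the controller needs external RAM.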

Peripheral Modes
Peripherals incorporate features designed to demonstrate that the peripheral works in a particular mode; they do not have options to check all the modes in which the peripheral can function. When you require a unique functionality from a peripheral, you need to narrow down and check whether the necessary interfaces and modalities have been designed in, or whether you need to do the work afresh. Sometimes the modes are passed as parameters so that the calling function can decide the mode dynamically.

In a user interface system designed for the modality of a typical user, input entry is normally handled by a standard keyboard. In an industrial application you may have to restrict the input to a limited number of keys. Then the difficulties start.

You may need to change the navigation scheme itself, which is considerable effort compared to writing the application itself. The TAB key is used widely in menu navigation, so you may need to change this in the key handling routine. Because of the nested calls it is sometimes very difficult to track the program sequence and hence very difficult to change.

Some other modes, like the baud rate, may have to be changed manually by the user at run time, inviting additional effort. Changing master to slave, or incorporating multi-master operation at run time for specific communication peripherals, will require modification of the device drivers.

Typical examples of peripheral behavior requirements are (1) handling real-time clocks and their possible synchronization across a multi-node network, (2) invoking the power-down and idle modes of the processor during a power failure, or (3) switching between memory mode and I/O mode in a Compact Flash.

There might be other unique requirements as well, such as accessing a configuration memory like an EEPROM in a predefined sequence, or switching off the backlight of the LCD when it is not used; these are specific functions that need to be implemented individually.

If your design includes an ADC, there could be plenty of modes, including a sequencer, simultaneous conversion, oversampling and averaging, and auto-conversion on triggers, all of which need to be handled in the device drivers and the application code.
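The averaging step of such an oversample-and-average mode might look like the sketch below, applied to a buffer of raw conversions captured by the driver. The simple integer mean is an assumption; some parts instead right-shift the accumulator to gain effective resolution.

```c
#include <stdint.h>

/* Average n raw ADC conversions.  The 32-bit accumulator avoids overflow
 * for any realistic burst of 16-bit samples. */
uint16_t adc_average(const uint16_t *samples, unsigned n)
{
    uint32_t acc = 0;
    for (unsigned i = 0; i < n; i++)
        acc += samples[i];
    return (uint16_t)(acc / n);   /* truncating mean */
}
```

Whether this loop lives in the driver, in the application, or in hardware as a sequencer mode is exactly the kind of mapping decision the paragraph above argues you must settle per peripheral.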

Boot Modes
You need to look into the boot modes from which the processor can boot and decide which one is applicable. Most industrial applications require auto-booting to avoid manual intervention, and this needs to be designed in.

In some systems you may have to initiate the auto-booting sequence manually the first time, using a debug port or its equivalent, so the necessary hardware for this cannot be skipped, even in production units. Sometimes the application and the OS also have to be loaded by a different mechanism from the initial boot program.

For example, you might need Ethernet connectivity that is not required in the end application. It might also be possible to boot the system from a network server. There is no rule of thumb for the mode that has to be chosen; you need to look into the options and decide which is suitable for the application. A repeater connected and communicating to a server can be booted from that same network server.

The boot modes and the procedure also change substantially across different processor architectures. In some cases the processor has a boot program that automatically takes care of these issues, but there might be one or two pins brought out externally for the user to choose among the various options. There might be a default mode that is not favorable for the application, and hence you need to figure out how to invoke the boot mode you want. The reference design may also use a different boot mode, so you need to be cautious about your selection and its implementation issues.

Flash Memory
When you work with self-contained 8- and 16-bit processors you are not really worried about the flash or the RAM. If you need to connect either program or data memory, the chip should have the necessary external interface options. In these architectures the program executes directly from flash, though sometimes you are given an option to execute part of the code from RAM. This is normally in digital signal processors, where some specific critical code can be run from RAM to speed up the execution cycle.

But in most 32-bit processors the program is normally moved from flash to RAM after the initial boot-up. So the RAM needs to be faster and also needs some synchronization with the processor; synchronous DRAMs (SDRAMs) have to be interfaced with the processor for executing the program.
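At its core, the boot-time move described above reduces to copying the image and jumping to it. The sketch below is a minimal illustration with placeholder names; on real hardware the addresses come from the linker script, and caches would need flushing before the branch.

```c
#include <string.h>
#include <stdint.h>
#include <stddef.h>

/* Startup-code sketch: move the program image from (slow) flash into
 * SDRAM.  flash_base/ram_base are placeholders for linker-script
 * symbols; a real loader would follow this with cache maintenance and
 * a branch to the entry point in RAM. */
void copy_image_to_ram(const uint8_t *flash_base, uint8_t *ram_base,
                       size_t image_size)
{
    memcpy(ram_base, flash_base, image_size);
}
```

This is also why SDRAM sizing matters early: the RAM must hold the whole executable image plus working data, not just the data segment as on a flash-execute 8-/16-bit part.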

You also need to decide whether you need a native NAND/NOR flash or whether to additionally provide a Compact Flash (CF) slot. You can also have a hard disk or a DiskOnChip. The decision is based on the program size, inclusive of the application and the operating system. When you are not sure of the final size of the image you are going to generate, it is advisable to have a CF slot option.

You should also look into the timing issues of upgrading the software in any of the flash memories described above. The times can be surprisingly high, as long as 30 to 60 minutes to program a 16MB flash memory. So in an industrial application, if you anticipate field upgrades in the initial phases of deployment, it is better to have a CF slot so that you can replace the card with the new software rather than reprogram in the field. After the application stabilizes and freezes, you can switch from the Compact Flash to the program flash IC. This will save costly downtime and retain flexibility.
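A back-of-envelope check on those figures: programming time scales roughly linearly with image size. The helper below assumes a fixed per-byte programming time, an illustrative figure rather than a datasheet value; at around 120 µs per byte, a 16MB image lands in the 30-plus-minute range quoted above.

```c
/* Rough flash reprogramming time in whole minutes, given an image size
 * in MB and an assumed per-byte programming time in microseconds (slow
 * serial/JTAG path; a datasheet figure should replace it). */
unsigned flash_program_minutes(unsigned size_mb, unsigned us_per_byte)
{
    unsigned long long total_us =
        (unsigned long long)size_mb * 1048576ULL * us_per_byte;
    return (unsigned)(total_us / 60000000ULL);  /* 60e6 us per minute */
}
```

Running the same arithmetic against a CF card swap (seconds of downtime) makes the field-upgrade trade-off in the paragraph above concrete.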

Operating Systems
The next issue when you move from 8-/16-bit processors to 32-bit is the need for an operating system. On the smaller devices you could have your own proprietary operating system, because the software there is little more than a sequence of instructions to the CPU and control of the peripherals.

If you wish to migrate to 32-bit, it becomes necessary to give due consideration to using an existing operating system rather than developing your own. If the business segment dictates volumes and you would not like to end up with royalty issues, you can consider developing your own operating system; but for smaller projects that is more or less reinventing the wheel. Beyond that, it brings in new issues: compatibility, upgrades, maintenance and optimization.

So when you decide to go in for a commercial off-the-shelf operating system, the choices among the competing platforms have to be considered. Should you go for Windows or for Linux? The choice depends on many factors, including the skill set available, the development cost, the unit cost and the timeline. If a royalty overhead on every shipped device is not desired, you need to look for a royalty-free OS.

Windows. There are still some ardent lovers of Windows who feel that open source is all jinxed and that, to keep strict delivery schedules, the compact edition of Windows, Windows CE, is a better choice. Microsoft keeps promoting Windows Embedded to contain the growth of embedded Linux. Windows CE is still the choice if the application is graphics intensive; development time is considerably lower with Windows CE for such applications. To address market needs, Microsoft offers different editions of its embedded version: Windows CE, Windows XP Embedded and Windows Embedded for POS (Point of Sale). Microsoft details the deliverables and suggests a choice based on the application.

Linux. The use of Linux in consumer and industrial embedded applications is emerging and growing at a fast pace. The first perceived attraction for the developer is that it is free and access to the source code is provided. You might be able to make a quicker start, since everything can be downloaded from the Internet. Initially it is amazing to look at the widespread Linux community and its exchange of information; the forums, user groups and suggestions are really overwhelming. But when it comes to moving from generics to specifics, the braking starts and you need to shift to lower gears.

The second important consideration is the reliability of the Linux system, which is utterly essential in embedded applications. Linux, being powered by the open source community, is well maintained and its integrity is assured, because developers put in their best efforts when the work is their own choice rather than part of their employment objectives. Fixes are handled by the open source community and made available at the earliest opportunity. This is in contrast to the branded versions, where you need to wait for fixes and hope that they are free of charge.

Linux kernel. The Linux kernel is robust, and it compiles to a very attractive form factor of less than 1MB. The kernel includes memory management and the process scheduler. The memory manager takes care of securing memory sharing and management across different programs; the process scheduler allots sufficient CPU time to processes. The kernel also incorporates the file system management, while the shell and the windowing system run outside it. When you do native compiling everything looks fine, so you are not poised for any setbacks.

When you need it cross-compiled for your target platform, you start encountering problems. Some of them can be solved directly with some additional effort, with references from the examples and illustrations, and with specific queries or FAQs on the vendor site. ARM and Linux go together comfortably well, yet you still need to rely on the board support packages delivered with your development system purchases.

Both stable and development versions of the Linux kernel are available for download from the Internet. The stable versions are even-numbered, like 2.4 and 2.6, while the development versions are odd-numbered, like 2.5. It is always advisable to use stable versions, especially for embedded systems. The kernel versioning also includes a patch-level number to indicate revision status, such as 2.4.19, with a higher number indicating a later revision.
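That even/odd convention is easy to encode; the helper below parses a 2.x version string and reports whether it names a stable series. It is a sketch for the 2.x numbering described here, not a general-purpose version parser.

```c
#include <stdio.h>

/* Returns 1 if the version string names a stable 2.x kernel (even minor
 * number, e.g. 2.4, 2.6), 0 for a development kernel (odd minor, e.g.
 * 2.5), and -1 if the string cannot be parsed. */
int kernel_is_stable(const char *version)
{
    int major = 0, minor = 0, patch = 0;
    if (sscanf(version, "%d.%d.%d", &major, &minor, &patch) < 2)
        return -1;
    return (minor % 2) == 0;
}
```

A build script could use such a check to refuse a development kernel before an embedded image ever gets flashed.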

The Linux kernel is monolithic in nature, meaning that all the core functions and the device drivers are part of the kernel. The functions inside the kernel are invoked by system calls from the application. The kernel handles all interrupts and exceptions. The kernel is also responsible for switching between tasks and hence is the multitasking master.

Building the kernel involves a minimum of three steps: building the dependencies, the image and the modules. You can either compile natively or cross-compile, based on your requirement. If it cross-compiles successfully, it then needs to be downloaded to the target for execution and verification. It is always advisable to configure the kernel and cross-compile with TCP/IP support enabled.

GUI Applications
User interfaces typically involve devices with input and output capability and with mild time constraints. The output capability is determined by the LCD screen size and the actual content displayed and refreshed.

If you have an application with a user interface, you need to look into the graphic windowing capabilities. Compared to Windows you will still find the options slightly inferior. You have choices of GUI, but you must also consider whether or not to pursue one that is royalty-free, like the OS. When you develop applications and start cross-compiling, you again end up with enormous RAM disk images.

You need to optimize GUI applications by isolating the libraries you do not intend to use. This requires a lot of effort from the design team, since the documentation is not clear and explicit; you need to adopt trial-and-error techniques, taking several iterations, to finally achieve your goal. The fonts, Unicode text formats and unused libraries need to be stripped. These might sometimes amount to as much as 35MB, which could be twice the size of the GUI application itself.

GUI applications hence need to be carefully designed, since the screens and the navigation create the first impression of the product. Sluggish navigation will spoil the credibility of the product. It is necessary to look into the real estate of the screen and how the parameters are populated, for reasonable execution and refresh rates. Care should also be taken to incorporate the right mix of text and graphics.

Linux Distributions
A lot of development effort is spent on setting up the development environment and on the optimization, build and rebuild process. So you may be tempted to escape these chores by buying a distribution off the shelf. There are no rules of thumb for the decision to purchase a distribution, and the embedded versions involve a substantially high investment.

Also to be considered is whether they are available for broad or only for narrow targets. If you are used to one Linux distribution and then switch targets, you may need to invest again. Distributions definitely help with structured optimization, but you become dependent on, or tied to, the distributor's packages. Flexibility in configuration and automation of the process are additional benefits you derive from purchasing a distribution. They also aid understanding by way of the documentation they provide.

It may well happen that, after trying your hand with the open source model and failing to meet deadlines, you are compelled to try out a reputed distribution to speed up the whole process. In that case the project is delayed, and additional investment is incurred at the fag end of the project. So it is better to look at the project schedule and make investment decisions earlier in the product life cycle, so that you are comfortable adhering to the time schedule.

Conclusion
We have covered a few significant pitfalls that can easily be avoided with careful consideration. The move to 32-bit processors involves a lot of managerial and engineering decisions, which need to be made timely and correctly. Some of these facts have to be learned the hard way and preserved as lessons learnt.

Kavitha Sundaram is manager of research and development and head of firmware development at Premier Evolvics Pvt Ltd. in Coimbatore, India.
