An 8- or 16-bit CPU may be ideal for your application at present. However, to stay competitive, you need to differentiate your product with continuous enhancements, including new features, faster speeds, improved product specifications, and reduced cost. If you don’t provide these, your competitors will.
One way to maintain your competitive edge is by incrementally improving your existing design. Over time, architectural limitations may make this process increasingly slow and expensive. Alternatively, you can port your design to a 32-bit platform. This can improve your product in several ways (Table 1).
Do you really need to port your design?
When porting from an 8-bit CPU to a 32-bit CPU, there are some considerations to keep in mind. One of the first is whether your existing CPU is still viable, and whether moving to a 32-bit CPU would meet a compelling need or offer an advantage you can leverage. Review your current and future product requirements relative to the advantages and disadvantages of each CPU.
8-bit applications are usually basic sensing and control systems with simple calculations. 8-bit CPUs often do well at bit-level operations and applications where the values involved are less than 256. A well-known architecture is the 8051.
Even the smallest 32-bit CPUs can do everything that 8-bit CPUs can do, and more, as Figure 1 shows:
- More complex calculations. Examples include native-mode DSP, image processing, and gesture recognition
- Data mining and analysis, and database lookup
- Multitasking through a real-time operating system (RTOS)
Even if you do not require any of these advanced features, 32-bit CPUs can improve your design in the following ways:
Power: Consider a common low-power design where the CPU sleeps in a low-power mode and periodically wakes up to execute code in active mode (Figure 2). 32-bit CPUs may require more power than 8-bit CPUs in both modes, but they take less time to execute the code. As a result, the 32-bit CPU spends more time in the low-power mode. In many cases, this reduces the average power.
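The sleep/wake trade-off can be checked with a short calculation. The sketch below is only an illustration: the function name and its parameters are assumptions made for this example, and real figures would have to come from the datasheets of the devices being compared.

```c
/* Average power of a duty-cycled MCU over one wake/sleep period.
 * All names and units here are illustrative assumptions.
 *   active_mw : power drawn while executing code (milliwatts)
 *   sleep_mw  : power drawn in the low-power mode (milliwatts)
 *   active_ms : time spent executing code in each period (milliseconds)
 *   period_ms : length of one full period, active plus sleep;
 *               must be greater than zero and at least active_ms */
double average_power_mw(double active_mw, double sleep_mw,
                        double active_ms, double period_ms)
{
    double sleep_ms = period_ms - active_ms;

    /* Time-weighted average of the two power levels. */
    return (active_mw * active_ms + sleep_mw * sleep_ms) / period_ms;
}
```

With made-up figures of 10 mW active, 0.01 mW asleep, and 10 ms of work per 1000 ms period for an 8-bit device, the function returns about 0.110 mW. With 15 mW active, 0.02 mW asleep, and 2 ms of work per 1000 ms period for a 32-bit device, it returns about 0.050 mW, even though the 32-bit device draws more power in both modes.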
Scalability: Today, most CPUs are marketed as a family of similar devices scaled from low- to high-performance. If your product needs to be scalable, then it makes sense your CPU should be scalable too. CPU scalability is usually defined in terms of:
- Instruction set. Higher-end family members should have more instructions or more modes of operation for existing instructions, while maintaining backward compatibility with lower-end instructions.
- Additional registers, or more bit definitions in existing registers
- Additional functions, for example interrupt control and debug
The ARM Cortex-M processor family is a good example of CPU scalability, as Figure 3 shows.
Cost: One perceived barrier to porting to 32 bits has been increased cost. With recent advances in technology, however, it is no longer necessarily the case that 32-bit devices are more expensive than 8-bit devices. A number of low-cost 32-bit devices are becoming available. For example, because of its simple design and small silicon area the ARM Cortex-M0 CPU is particularly cost-effective. One example of an MCU built around the Cortex-M0 is Cypress Semiconductor’s entry-level PSoC 4000, which is as low as $0.29 in quantity.
In addition, Table 1 shows that the support for high code density and faster execution that 32-bit CPUs offer can help to lower cost.
It’s not just about the CPU
It is common to focus just on porting your firmware code to the new CPU. However, remember that the CPU comes as part of an MCU device, and the MCU may offer as many opportunities as its CPU for meeting customer demands for improvements. For example:
- Does the MCU have peripheral hardware features that will enable product feature improvements?
- Can the peripherals operate using less code and put less load on the CPU? This may result in the system using less memory, possibly reducing cost.
- Can the device help you reduce board-level or system-level cost? For example, can you move certain functions off the PCB into the MCU?
- Is the MCU flexible enough to let you adapt to changing requirements without having to lay out a new PCB?
Finally, note that an MCU device is often only as good as the integrated development environment (IDE) that supports it. Confirm that the new IDE is more than just an editor, compiler, and debugger. IDEs that enable you to quickly construct an entire application using all of the MCU hardware features as well as the firmware can significantly speed design. Ample development kit and application note support can also help.
Code porting tips
If you decide to port a design to a 32-bit CPU, keep these considerations in mind:
Select an entry-level 32-bit CPU/MCU and IDE. For your first port into the 32-bit world, keep it simple. Doing so will reduce the risk of introducing defects as you become familiar with the differences in 32-bit design. Select a basic entry-level device, as well as an IDE that can simplify the porting process. One example is Cypress Semiconductor’s PSoC 4000 MCU, supported by the PSoC Creator IDE.
Select a new compiler. When you port your code to a new CPU, you may also have to choose a new compiler. A number of compilers, some of which are free, are available for 32-bit CPUs. Examples include GCC, ARM/Keil MDK, and IAR.
Get your build and debug tools working. Create a small test program, for example to blink an LED. You will gain experience with the new tools that will help you with the remaining steps.
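A first test program along these lines might look like the sketch below. It is written to run on a PC, so the GPIO register is an ordinary variable. The register, the pin mask, and the function names are all assumptions for illustration; on real hardware they would be replaced by the definitions from the device header files or the IDE-generated API.

```c
#include <stdint.h>

/* Hypothetical GPIO data register. On real hardware this would be a
 * memory-mapped register defined by the device header; here it is an
 * ordinary variable so that the sketch runs on a PC. */
static volatile uint32_t gpio_data_reg = 0u;

#define LED_PIN_MASK (1u << 3)   /* assumption: the LED is wired to bit 3 */

/* Toggle the LED pin and report whether the LED is now on (1) or off (0). */
int led_toggle(void)
{
    gpio_data_reg ^= LED_PIN_MASK;
    return (gpio_data_reg & LED_PIN_MASK) != 0u;
}

/* Crude busy-wait delay. The loop count is a placeholder that would be
 * tuned against the real CPU clock. */
void delay_loops(volatile uint32_t loops)
{
    while (loops-- > 0u) {
        /* spin */
    }
}

/* Blink the LED the given number of times (one blink = on, then off).
 * Returns the LED state after the last toggle. */
int blink(unsigned times, uint32_t loops_per_half_period)
{
    int state = 0;
    for (unsigned i = 0; i < 2u * times; ++i) {
        state = led_toggle();
        delay_loops(loops_per_half_period);
    }
    return state;
}
```

Single-stepping through something this small is a quick way to confirm that the compiler, programmer, and debugger all work together before any real code is ported.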
Rewrite assembler code. Ideally, your existing code should be in C (or some other higher-level language). Any of your code that is in the assembly language of your 8-bit processor is probably not portable. If you have any assembler code in your current design, consider rewriting it in C before beginning the porting process.
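As an illustration, a 16-bit additive checksum is the kind of routine that is often hand-written in 8-bit assembly with explicit carry handling. A possible C rewrite is sketched below; the function name is hypothetical, and the fixed-width types from stdint.h are used so that the result does not depend on the native integer size of the CPU.

```c
#include <stddef.h>
#include <stdint.h>

/* 16-bit additive checksum over a byte buffer (hypothetical example routine).
 * In C the compiler takes care of carry propagation, so the same source
 * builds for an 8-bit or a 32-bit CPU. */
uint16_t checksum16(const uint8_t *data, size_t len)
{
    uint16_t sum = 0u;

    for (size_t i = 0; i < len; ++i) {
        /* The cast makes the intended wrap-around modulo 65536 explicit. */
        sum = (uint16_t)(sum + data[i]);
    }
    return sum;
}
```

Running the C version and the original assembly routine on the same input data is one way to confirm that the rewrite behaves identically before the port begins.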
Encapsulate MCU-specific code. If your code is modular (a coding best practice), you may have already done this. The portion of your code that directly interacts with MCU registers, such as to read I/O ports, should be in files separate from the rest of the code. Encapsulate the code in those files in functions with generic names, such as UART_Receive(). Then you can rewrite those functions for the new MCU without having to change the rest of your code.
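A minimal sketch of this separation is shown below, collapsed into one listing for brevity. Only the name UART_Receive() comes from the text above; its signature, the simulated registers, and the helper functions are assumptions made so that the example can run on a PC.

```c
#include <stdint.h>

/* ---- Generic interface (would live in a header such as uart_hal.h) ---- */
/* Returns 1 and stores the byte if one was received, otherwise returns 0. */
int UART_Receive(uint8_t *byte);

/* ---- MCU-specific implementation (would live in its own .c file) ----
 * These variables stand in for hardware registers. On a real MCU they
 * would be the device's UART status and data registers. */
static volatile uint8_t sim_uart_status = 0u;   /* bit 0: receive data ready */
static volatile uint8_t sim_uart_data   = 0u;
#define SIM_RX_READY 0x01u

int UART_Receive(uint8_t *byte)
{
    if ((sim_uart_status & SIM_RX_READY) == 0u) {
        return 0;                                /* nothing received yet */
    }
    *byte = sim_uart_data;
    sim_uart_status &= (uint8_t)~SIM_RX_READY;   /* mark the byte as consumed */
    return 1;
}

/* Simulation helper: pretend the hardware has just received a byte. */
void sim_uart_inject(uint8_t byte)
{
    sim_uart_data = byte;
    sim_uart_status |= SIM_RX_READY;
}

/* ---- Application code: no register names appear here ---- */
/* Returns 1 if the next received byte is the (hypothetical) start command. */
int app_is_start_command(void)
{
    uint8_t b;
    return UART_Receive(&b) && b == (uint8_t)'S';
}
```

When the design moves to the new MCU, only the middle section needs to be rewritten. The application function keeps calling UART_Receive() unchanged.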
Other architecture changes
A new MCU may allow you to offload functions from the CPU to peripherals. Also, a new IDE may auto-generate code for you. To take advantage of these features, consider re-architecting some or all of your code.
Because it is easier to implement task switching in 32-bit CPUs, consider re-architecting your code as a set of separate tasks to be used with a real-time operating system (RTOS). Example RTOS vendors for 32-bit systems include Segger and Micrium.
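The sketch below shows one way code might be split into separate task functions. It does not use any real RTOS API: a simple cooperative dispatcher stands in for the scheduler, and every name in it is a hypothetical placeholder. The point is only that once the work is divided into independent task functions, moving them under an RTOS becomes a smaller step.

```c
#include <stddef.h>
#include <stdint.h>

typedef void (*task_fn)(void);

/* Hypothetical application state, one counter per task. */
static uint32_t sensor_samples  = 0u;
static uint32_t display_updates = 0u;

static void sensor_task(void)  { sensor_samples++;  }   /* e.g. read a sensor */
static void display_task(void) { display_updates++; }   /* e.g. refresh a display */

/* Each task runs once every 'period' ticks. */
typedef struct {
    task_fn  run;
    uint32_t period;
} task_t;

static const task_t task_table[] = {
    { sensor_task,  1u },   /* every tick */
    { display_task, 4u },   /* every fourth tick */
};

/* Call once per timer tick; runs every task that is due on this tick. */
void scheduler_tick(uint32_t tick)
{
    for (size_t i = 0; i < sizeof task_table / sizeof task_table[0]; ++i) {
        if (tick % task_table[i].period == 0u) {
            task_table[i].run();
        }
    }
}
```

Calling scheduler_tick() for ticks 0 through 7 runs the sensor task eight times and the display task twice. Under a real RTOS, each function would typically become its own task with its own stack and priority instead of being called from a shared loop.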
Incremental build and debug
When designing new code, a coding best practice is to add, test, and debug code in small increments. This makes it easier to find and fix defects. The same is true for porting – port, test, and debug code on the new MCU in small increments.
Example CPU and MCU
To get a better understanding of the porting process, let us examine the process in the context of the Cortex-M0 and the PSoC 4000 in more detail. The ARM Cortex-M0 processor is the smallest ARM core available, and a natural and cost-effective migration path from 8-bit and 16-bit CPUs. Its register architecture (Figure 4) and instruction set make it an effective C engine.
All registers are 32-bit, which enables 32-bit addressing and a 4-GByte address space. Most 8-bit CPUs are limited to a 64-Kbyte address space.
There are 13 general-purpose registers, R0 – R12. (Low registers R0 – R7 have more support in the instruction set.) Special registers include:
- dual stack pointers (R13) to help implement a real-time operating system (RTOS)
- link register (R14) for fast return from function calls
- program counter (R15)
- program status register (PSR) contains instruction results such as zero and carry flags as well as the current exception number
- interrupt mask register
- control register controls which stack pointer is active
The Cortex-M0 core instruction set is simple but powerful, with a large number of addressing modes. It enables excellent code density. C code ported from an 8-bit CPU to a Cortex-M CPU frequently uses less memory.
The ARM Cortex-M series CPUs have an instruction pipeline, as Figure 5 shows. This increases overall code execution speed because the CPU can execute one instruction while simultaneously fetching and decoding subsequent instructions.
The ARM Cortex-M CPU series integrates support for interrupts directly into the CPU core, using a nested vectored interrupt controller (NVIC). NVIC features include:
- Dynamic priorities and automatically prioritized nesting of pending interrupts
- Low latency – the CPU automatically stores and restores its state with no instruction overhead
- Tail-chaining – back-to-back processing of nested interrupts without the overhead of state saving and restoration between interrupts
- Late arrival – a higher priority interrupt that arrives during the stack push operation of a lower priority interrupt is serviced first.
These features enable faster and more deterministic interrupt handling. A system timer, “SysTick”, which facilitates RTOS usage and can operate during CPU sleep, is also included. With the high level of interrupt support available, you can consider changing your architecture to be more interrupt-based.
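One common shape for an interrupt-based design is a short interrupt handler that only records an event, with the main loop doing the work and then returning to sleep. The sketch below illustrates that pattern under stated assumptions: the handler is an ordinary function so the example can run on a PC, and all names are hypothetical. On a Cortex-M device the handler would be installed in the vector table instead.

```c
#include <stdint.h>

static volatile uint8_t button_event = 0u;   /* set by the ISR, cleared by main */
static uint32_t presses_handled = 0u;

/* Interrupt service routine: keep it short and just record the event.
 * Here it is a plain function so that a test can call it directly. */
void button_isr(void)
{
    button_event = 1u;
}

/* One pass of the main loop: handle a pending event, if there is one.
 * Returns 1 if work was done, or 0 if the CPU could go back to sleep. */
int main_loop_step(void)
{
    if (button_event == 0u) {
        return 0;
    }
    button_event = 0u;        /* clear the flag before doing the work */
    presses_handled++;        /* placeholder for the real event handling */
    return 1;
}
```

In a real design the flag handling would need review for race conditions, for example an interrupt arriving between the check and the decision to sleep. Counting events rather than using a single flag is one possible variation.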
ARM’s Cortex-M processor series integrates debug features directly into the CPU core, which enables better debug support across a number of IDEs.
The Cortex-M0 core is part of a larger family of Cortex-M processors that all have the same register architecture and execute some or all of the Thumb-2 instruction set. This makes it easier to upgrade to a more powerful CPU such as the Cortex-M3 processor in Cypress’s PSoC 5LP.
The PSoC 4000 is the entry-level member of the PSoC 4 family. In addition to the Cortex-M0 processor, it features a set of flexible and dynamically configurable peripherals, as Figure 6 shows.
This MCU also features capacitive touch sensing (CapSense). Capacitive touch sensing offers significant advantages over mechanical buttons in terms of cost, performance, and ESD protection. CapSense features include:
- Easy to implement buttons, sliders, and proximity sensing solutions, with up to 16 inputs routable to various I/O pins
- High signal-to-noise ratio (SNR) ensures touch accuracy in noisy environments
- Robust water tolerance for severe environments
- SmartSense Auto-Tuning speeds time-to-market and eliminates the need for calibration
The CapSense block includes two DACs and a comparator, which you can use for other purposes if CapSense is not required.
Cypress also offers PSoC Creator, an integrated design environment (IDE) for the PSoC 3, 4, and 5LP devices. PSoC Creator is a free Windows-based IDE which enables concurrent hardware and firmware design of PSoC-based systems.
You can design using classic, familiar schematic capture supported by over 100 pre-verified, production-ready PSoC Components. The Components include auto-generated API code, which can significantly reduce the amount of code that you have to write. Using PSoC Creator it is easy to port designs between PSoC families, at both the configurable hardware level and the firmware level, as Figure 7 shows.
You can also export PSoC Creator designs to other IDEs such as µVision and IAR.
It is now possible to upgrade legacy 8-bit and 16-bit designs to 32 bits, and still meet cost targets. Several considerations must be kept in mind when planning a port to a new CPU; one of them is to select an entry-level 32-bit MCU and an IDE that supports it well.
Mark Ainsworth is an Applications Engineer Principal at Cypress Semiconductor. He has a BS in Computer Engineering from Syracuse University and an MSEE from the University of Washington, and has over 20 years of experience in embedded systems engineering. He can be reached at .
Ranjith Mundoor is an applications engineer at Cypress Semiconductor. He develops whole product collateral for the PSoC family of devices. His interests include MCU programming, bootloaders, and embedded communication protocols. He can be reached at .
1. Cypress Semiconductor’s application note AN89610 on how to create optimized C code using the GCC or MDK compiler.
3. Dhrystone is a computing benchmark program used to calculate the relative performance of an MCU. (DMIPS = Dhrystone million instructions per second.) Data referenced from The Definitive Guide to the ARM Cortex-M0, ISBN: 978-0-12-385477-3.