Basics of core-based FPGA design: Part 3 – Picking the right core options - Embedded.com

The three common processor implementation models used in FPGA cores are the microprocessor, microcontroller, and specialty processor. A microprocessor is generally a stand-alone core with limited peripherals. Microprocessors are usually implemented with at least a 32-bit or 64-bit architecture.

They are generally targeted toward advanced computing applications. Microprocessors may include advanced performance-oriented architectural elements, such as SIMD units that provide the vector-based math functionality commonly used in math-intensive applications.

The microprocessor design model is based on the implementation of an optimized, high-performance processor core with limited on-chip peripherals. This allows the design team to choose and implement the required peripheral functionality externally. The interface to these external peripherals is generally implemented via a high-throughput interface bus such as PCI-X.

In contrast to the microprocessor model, microcontrollers generally include significant on-chip peripheral functionality. Microcontrollers are generally targeted toward specific application markets such as motor-control or PDA devices.

The target application influences the peripheral set mix. Microcontrollers follow the system-on-a-chip (SoC) design philosophy. This philosophy encourages the implementation of as many peripherals on-chip as possible, ideally working toward a single-chip solution. Common peripheral block examples include Ethernet and USB communication and LCD controllers. Microcontrollers span a wide range of performance.

Specialty processors target very specific applications including audio processing, software defined radio, or the implementation of network protocols at the highest possible speed. While they may be categorized as either microprocessors or microcontrollers, they are listed as a separate category here because they possess specialized architectures, resources and capabilities. Examples include network processors and digital signal processors (DSPs).

Each of these processor implementation models is targeted toward different applications. Selecting a processor model to implement the specific requirements of a project requires many considerations.

The primary trade-off areas include target application, performance, architecture, integration, power and cost. A primary FPGA embedded processor implementation advantage is the ability to repartition hardware functionality to potentially create new processor implementations without board re-spins.

With the processor and the circuitry it controls incorporated into the same device, the design team has control over more of the design elements, since both software and hardware functionality may be implemented using programming languages. The flexibility of software and hardware reconfiguration allows the design team to determine the optimal mix of hardware and software functionality.

The ability to repartition an embedded FPGA processor design increases the number of potential design implementation options. Some functional design implementation options are presented in the following list.

Design Functional Implementation Options

1) Single processor
2) Multiple processors
3) Floating-point unit
4) State machine
5) Coprocessor
6) Dedicated FPGA logic implementation
7) Off-chip peripherals

There are several broad processor IP categories. Some example processor-related IP cores are presented in Table 14.1 below.


Table 14.1. Typical processor IP cores

Picking the right processor core & peripherals
The processor selection affects all aspects of the system design, budget, and schedule for a project. It is typically one of the most critical decisions made by a development team because of the broad impact it has on the performance of a project.

For this reason, the selection of a processor will typically be a collaborative effort between the system, hardware and software teams. The interactions between these decisions can become complex. Some factors to consider when selecting a processor core are presented in the following list.

Processor Selection Factors 

1) Target application
2) Optimization for specific architectures or highest possible performance
3) Resource utilization
4) Simulation support
5) Testbench coverage
6) Support for individual simulation tool sets
7) Availability of real-world application-oriented simulation results
8) Documentation completeness and accuracy
9) Access to original core developers or qualified experts
10) Number and competence of IP vendor staff
11) System, hardware and software tools
12) Operating system

A processor trade-off study must compare processor core architectural features such as the pipeline, memory interface, and core speed. Taken together, these architectural features provide the detail needed to understand the true performance of the processor.

As discussed previously, a deeper pipeline may be leveraged for higher performance provided that branching is limited. Large register files reduce the number of load/store operations. Cache implementation can improve overall performance significantly by reducing the number of external memory accesses. Some architectural factors to consider when evaluating processor cores are presented in the following list.

Processor Architectural Factors
1) Type, size, and implementation of the memory/peripheral bus
2) Error detection and correction mechanisms
3) Bus transaction types such as bursting
4) Size and model of address space
5) Type and size of cache (instruction/data)
6) Type of controllers such as DMA and MMU
7) Functional elements such as the register files/execution units
8) Type of pipeline and strategies to prevent stalls; for example, branch prediction
9) Write buffers for external memory
10) Interrupt response and structure; for example, shadow registers
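
The pipeline and cache effects described before this list can be estimated with two textbook formulas: cycles per instruction (CPI) including branch stalls, and average memory access time. The following is a minimal sketch in Python; the function names and all workload numbers are hypothetical and chosen only for illustration, not taken from any particular core.

```python
def effective_cpi(base_cpi, branch_fraction, mispredict_rate, penalty_cycles):
    """Cycles per instruction once branch-misprediction stalls are added."""
    return base_cpi + branch_fraction * mispredict_rate * penalty_cycles


def average_memory_access_cycles(hit_cycles, miss_rate, miss_penalty_cycles):
    """Average memory access time, in cycles, for a single-level cache."""
    return hit_cycles + miss_rate * miss_penalty_cycles


# Hypothetical workload: 20% of instructions are branches, 10% of those mispredict.
shallow = effective_cpi(1.0, 0.20, 0.10, penalty_cycles=2)
deep = effective_cpi(1.0, 0.20, 0.10, penalty_cycles=6)
print(f"shallow pipeline CPI: {shallow:.2f}, deep pipeline CPI: {deep:.2f}")

# Hypothetical cache: 1-cycle hit, 5% miss rate, 20-cycle external memory access.
print(f"average access: {average_memory_access_cycles(1, 0.05, 20):.2f} cycles")
```

In this simple model the deeper pipeline pays off only if its clock-rate gain outweighs the higher CPI, which is why limited branching matters.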

Other factors to consider during a processor trade study include development tools, IP availability, supported RTOSs, and any other critical items that impact performance or development efficiency. A spreadsheet is a good tool for summarizing design options.

Consider the use of tools that support code optimization while implementing proactive measures early in the design effort to offset any significant software issues that could require software redesign. To better understand these trade-offs, the trade study shown below presents an overview of some important processor selection criteria.

Processor Selection Criteria
1) Performance
2) Architecture
3) RTOS support
4) IP availability
5) Processor category
6) Tool features
7) Technical support
8) Reference code/examples
9) Evaluation boards
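
A trade study spreadsheet of this kind usually reduces to a weighted scoring matrix. The sketch below shows one possible way to compute it in Python; the candidate names, the weights, and the 1-to-5 ratings are all hypothetical placeholders, and only a few of the criteria above are included.

```python
def weighted_scores(weights, ratings):
    """Return {candidate: weighted average rating} across all criteria."""
    total_weight = sum(weights.values())
    return {
        name: sum(weights[c] * candidate[c] for c in weights) / total_weight
        for name, candidate in ratings.items()
    }


# Hypothetical weights: a higher number means the criterion matters more.
WEIGHTS = {
    "performance": 5,
    "rtos_support": 3,
    "ip_availability": 3,
    "tool_features": 2,
    "technical_support": 2,
}

# Hypothetical ratings on a 1 (poor) to 5 (excellent) scale.
RATINGS = {
    "core_a": {"performance": 4, "rtos_support": 3, "ip_availability": 5,
               "tool_features": 3, "technical_support": 2},
    "core_b": {"performance": 3, "rtos_support": 5, "ip_availability": 3,
               "tool_features": 4, "technical_support": 4},
}

scores = weighted_scores(WEIGHTS, RATINGS)
for name, score in sorted(scores.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {score:.2f}")
```

The weights encode the project's priorities, so they are best agreed on by the system, hardware, and software teams before any candidate is rated.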

Hardware Implementation Factors
During the hardware design effort, a few key hardware factors should be taken into consideration. Hardware implementation factors associated with FPGA embedded processor design include device-level, board-level, design optimization, embedded processor setup, and IP use.

All of these design factors are interrelated. Important items affecting the embedded processor design optimization process include FPGA device design margin, FPGA board orientation, data flow through the FPGA, informed pin assignment, utilization of unused pins, access to internal FPGA signals, and clocking. The following list summarizes these embedded processor design factors.

Key Hardware Design Factors 

1) Tool selection
2) Design margin
3) Device selection
4) Design optimization
5) Data flow and FPGA orientation
6) Debug hooks
7) System clocking
8) Bus interconnection and management strategy
9) Device mapping
10) IP usage
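
Design margin can be tracked by comparing estimated resource usage against device capacity. The following Python sketch assumes a target ceiling of 80% utilization per resource; that threshold, the resource names, and the counts are illustrative assumptions rather than vendor figures.

```python
def utilization_report(used, available, max_utilization=0.80):
    """Return {resource: (fraction_used, within_margin)} for each resource."""
    report = {}
    for resource, capacity in available.items():
        fraction = used.get(resource, 0) / capacity
        report[resource] = (fraction, fraction <= max_utilization)
    return report


# Hypothetical device capacity and estimated design usage.
AVAILABLE = {"logic_cells": 24000, "block_ram": 56, "dsp_blocks": 48}
USED = {"logic_cells": 18000, "block_ram": 50, "dsp_blocks": 10}

report = utilization_report(USED, AVAILABLE)
for resource, (fraction, ok) in report.items():
    status = "ok" if ok else "over margin"
    print(f"{resource}: {fraction:.0%} used ({status})")
```

A resource flagged as over margin is a prompt to revisit device selection or repartition functionality before the design grows further.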

Some of the factors affecting tool selection are traditional FPGA design implementation capabilities, IP integration, target FPGA selection, and interoperability of traditional FPGA design tools and processor implementation tools.

An important tool consideration is the method and flow used to build the embedded processor. Typically the design tool flow implementation options range from manual to highly automated. The manual flow allows a high level of control over the system implementation, but at the cost of time.

The automated flow can implement a broad range of design functionality. Complex designs are often implemented using a combination of the two flows. The first design pass can be implemented with the assistance of automated wizards, with more detailed modification and enhancements being implemented manually.

Software Implementation Factors
Software development for an FPGA embedded processor is very similar to the flow and process of software development for a conventional discrete processor. As with any other design effort, tools play a key role in a successful development effort. At the core of the software tool chain is the integrated development environment (IDE).

This tool suite brings together an editor, optimizing compiler, incremental linker, make utility, simulator and non-intrusive debugger. A good example of a popular IDE is the Eclipse IDE. Popular compiler and debugger tools are gcc and gdb.

Even with the best tools, the software design implementation can increase in complexity to a point where additional levels of software abstraction are required. With the increased software abstraction levels, the embedded system must still be able to exhibit real-time response to the events it handles.

A real-time operating system (RTOS) can be used to implement a level of abstraction while also supporting real-time event handling. To meet critical timing requirements, the selected embedded operating system must be deterministic enough to provide an acceptable real-time response for the system in question. The two categories of determinism are hard and soft. Of the two, soft determinism permits the larger amount of event timing jitter (timing uncertainty).
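
One way to quantify timing uncertainty is to measure event response latencies and look at their spread. The sketch below takes jitter to be the difference between the worst and best observed latency, which is one common working definition and an assumption here; the sample values are made up.

```python
def timing_jitter(latencies_us):
    """Spread between the slowest and fastest observed response, in microseconds."""
    return max(latencies_us) - min(latencies_us)


def meets_hard_deadline(latencies_us, deadline_us):
    """A hard real-time requirement is met only if every response is on time."""
    return max(latencies_us) <= deadline_us


# Hypothetical interrupt-to-task response times measured on a target board.
samples_us = [12, 15, 11, 40]

print(f"jitter: {timing_jitter(samples_us)} us")
print(f"meets 25 us deadline: {meets_hard_deadline(samples_us, 25)}")
```

A single late response fails the hard-deadline check, whereas a soft real-time system might tolerate that occasional outlier.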

A good RTOS solution must provide real-time deterministic performance while also connecting the lower-level software to the hardware. The package that provides this lower-level connection is called the board support package (BSP). A BSP includes the boot code for the initialization of the processor, low-level drivers and interrupt service routines for peripherals and related system hardware.

A good RTOS will also include important middleware components including, but not limited to, a TCP/IP stack, web server, USB stack, encryption software, and other popular components.

There are many items to consider during the selection of an RTOS. Some of the most important considerations are the API set, tasking model, kernel robustness, interrupt response and footprint.

Any processor core under consideration will typically have a list of supported or certified operating systems that have been verified.

A final design factor relating to RTOS implementation that can influence a project’s schedule is the integration between the selected RTOS and IDE tool set. Tight coupling between the RTOS and the implementation tool set can improve efficiency by providing additional debugging capability.

One of these capabilities is task profiling, which is used to ensure that the software implemented follows the defined priority and resource management schemes. Considerations important in the selection and implementation of an RTOS are presented in the following list.

RTOS Selection Factors 

1) Determinism
– Is the kernel hard or soft?
– Defines amount of timing uncertainty
2) Scheduling affects robustness
– Priority-based
– Preemption versus nonpreemption
– Preemptive is used in real-time systems
3) Use a standardized API set
– Wrappers assist where no API standard exists
4) Understand synchronization and communication approaches
– Avoid deadlock
– Task communications promote better code readability and reuse at the cost of more memory utilization
5) Use tasks to partition
– Promotes compartmentalization for code reusability
6) Understand memory usage model
– Task stack size
– Avoid stack overflow issues
7) Use the best licensing model for controlling cost/effort
– Is “free” really free?
– Similar for hardware IP
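
The priority-based, preemptive scheduling item in the list above can be illustrated with a small tick-based simulation. This is only a sketch of the general idea, not the behavior of any specific RTOS kernel; the task names, priorities, and timings are hypothetical, and a higher number is assumed to mean a higher priority.

```python
def simulate_preemptive(tasks, ticks):
    """Simulate priority-based preemptive scheduling, one decision per tick.

    tasks: list of (name, priority, release_tick, run_ticks) tuples.
    Returns the name of the task that ran in each tick, or "idle".
    """
    remaining = {name: run_ticks for name, _, _, run_ticks in tasks}
    timeline = []
    for now in range(ticks):
        # A task is ready once released and while it still has work left.
        ready = [(priority, name) for name, priority, release, _ in tasks
                 if release <= now and remaining[name] > 0]
        if not ready:
            timeline.append("idle")
            continue
        # Highest priority wins; ties fall back to name order in this sketch.
        _, running = max(ready)
        remaining[running] -= 1
        timeline.append(running)
    return timeline


# Hypothetical tasks: a low-priority logger and a high-priority motor task.
TASKS = [("logger", 1, 0, 3), ("motor", 5, 1, 2)]
timeline = simulate_preemptive(TASKS, ticks=6)
print(timeline)
```

In the resulting timeline the logger starts first, is preempted as soon as the motor task is released, and resumes only after the higher-priority task completes.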

Next in Part 4: Implementing a design
To read Part 1, go to Core types and trade-offs
To read Part 2, go to System Design Considerations

Used with permission from Newnes, a division of Elsevier. Copyright 2006, from “Rapid System Prototyping with FPGAs,” by R.C. Cofer and Ben Harding. For more information about this title and other similar books, please visit www.elsevierdirect.com.

RC Cofer has almost 25 years of embedded design experience, including real-time DSP algorithm development, high-speed hardware, ASIC and FPGA and project focus. His technical focus is on rapid system development of high-speed DSP and FPGA-based designs. He holds an MSEE from the University of Florida and a BSEE from Florida Tech.

Ben Harding has a BSEE from the University of Alabama, with post-graduate studies in DSP, control theory, parallel processing and robotics. He has almost 20 years of experience in embedded systems design involving DSPs, network processors and programmable logic.
