Program flash memory using parallel flash loaders and CPLDs

Flash memory has been the answer to many an engineer's wishes. Since they were introduced in 1988, these electrically erasable, programmable, read-only memories have delivered high-density nonvolatile storage that's relatively affordable and straightforward to program and erase. Flash memory devices are now widely used in a variety of applications to store configuration, program, or memory data.

Although flash memory can be straightforward to program, a new method of programming makes it easier and offers cost and time savings. In this article I'll describe this method and compare it with the traditional methods for programming flash memory.

More flash in the forecast
Before flash memory was available, embedded systems designers typically used EPROMs or EEPROMs for nonvolatile digital storage. Both device types were adequate but less than ideal because of their awkward programming and erasure requirements [1].

In contrast, flash memory can be electrically erased (in blocks or in sectors, depending on the specific device) considerably faster than an EPROM or EEPROM can be erased. This block-based erasure method allows a flash chip to share the erase circuits within a block, reducing die size and lowering manufacturing cost. The other advantage of block-based erasing is that more than one copy or version of the contents can be stored in the same chip, providing a fail-safe mechanism for updating the memory. A master version can be stored in a sector of the device that never gets erased and can be accessed if any errors occur during programming or erasure of the other sectors.
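The fail-safe arrangement described above can be sketched in a few lines. This is only an illustration under stated assumptions: each image is assumed to carry a trailing CRC-32 of its payload, the flash sectors are modeled as plain byte strings, and the names `make_image`, `is_valid`, and `select_image` are hypothetical rather than taken from any vendor tool.

```python
import zlib


def make_image(payload: bytes) -> bytes:
    """Append a 4-byte little-endian CRC-32 so the image can be checked later."""
    return payload + zlib.crc32(payload).to_bytes(4, "little")


def is_valid(image: bytes) -> bool:
    """Return True if the trailing CRC-32 matches the payload."""
    if len(image) < 4:
        return False
    payload, stored = image[:-4], image[-4:]
    return zlib.crc32(payload).to_bytes(4, "little") == stored


def select_image(update: bytes, master: bytes) -> bytes:
    """Prefer the updated image; fall back to the never-erased master copy."""
    return update if is_valid(update) else master


# Assumed example data: a master copy and an update in separate sectors.
master = make_image(b"factory firmware")
good_update = make_image(b"field update v2")

# Simulate an interrupted programming pass: unprogrammed flash reads as 0xFF.
bad_update = good_update[:8] + b"\xff" * (len(good_update) - 8)
```

A boot routine built this way would run `select_image(update, master)` at power-up, so a programming pass that fails partway through leaves the system with a usable image.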

Densities for flash memory devices have increased exponentially since the late 1980s, with vendors now offering as much as 8Gb of storage capacity in a single chip. Demand for flash memory continues to grow rapidly because it offers nonvolatile storage that's well suited to embedded applications in consumer, automotive, computing, and industrial products, all of which require ever-larger storage capacities.

Flash memory is typically used for permanent data storage, but designers are also using it to hold configuration data or code. In these latter cases, the flash must be programmed before it's used. Although flash chips are easier and faster to program than their predecessors, project teams are under constant pressure to cut the time spent in production. With traditional methods, memory programming can take a long time, and as storage densities increase, programming times increase with them, magnifying the challenge. The traditional programming methods also provide little flexibility for last-minute design changes or for programming updates while the product is in the field, both of which are increasingly needed to add features or fix bugs. Programming methods must offer the flexibility to accommodate these updates.

Traditional programming
Three options for programming flash memory devices are commonly used today. One option, known as in-system programming (ISP), calls for programming the device after it's been installed on a printed circuit board (PCB). An engineer runs a small program on an existing microprocessor on the same PCB and has the microprocessor program the flash memory device. The program itself is either stored in system memory or accessed through in-circuit emulator hardware, either of which adds extra equipment and an additional manufacturing process step. Data transfer to the flash memory device is inefficient because the microprocessor must first fetch the data from the source, store it in any available RAM, and then program the device.
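To make the processor-driven option concrete, here is a minimal sketch of the bus cycles involved. It assumes a 16-bit flash that uses the AMD/Fujitsu-style command set (unlock writes to addresses 0x555 and 0x2AA before each word program); the article does not name a specific device, so treat that choice as an assumption. `FakeFlashBus` is a hypothetical software model standing in for the real memory bus.

```python
class FakeFlashBus:
    """Hypothetical model of a 16-bit flash device on the processor bus."""

    # Assumed AMD/Fujitsu-style unlock and program prefix.
    PREFIX = [(0x555, 0xAA), (0x2AA, 0x55), (0x555, 0xA0)]

    def __init__(self, words: int):
        self.mem = [0xFFFF] * words  # erased flash reads as all ones
        self.cycles = []             # log of every (address, data) write cycle
        self._step = 0               # position within the command prefix

    def write(self, addr: int, data: int) -> None:
        self.cycles.append((addr, data))
        if self._step < 3:
            # Advance through the prefix, or start over on a mismatch.
            matched = (addr, data) == self.PREFIX[self._step]
            self._step = self._step + 1 if matched else 0
        else:
            self.mem[addr] &= data   # programming can only clear bits
            self._step = 0

    def read(self, addr: int) -> int:
        return self.mem[addr]


def program_word(bus: FakeFlashBus, addr: int, data: int) -> None:
    """Issue the three prefix cycles, then the data cycle."""
    for cmd_addr, cmd_data in FakeFlashBus.PREFIX:
        bus.write(cmd_addr, cmd_data)
    bus.write(addr, data)
    # A real device would now be polled for completion; the model is instant.


def program_buffer(bus: FakeFlashBus, base: int, ram_copy: list) -> None:
    """Program words that the processor has already staged in RAM."""
    for offset, word in enumerate(ram_copy):
        program_word(bus, base + offset, word)
```

Under these assumptions every programmed word costs four bus write cycles plus status polling, on top of the time the processor spends copying the data into RAM first.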

In the second programming option, the engineer pre-programs the flash memory device before installing it onto the PCB, which increases the manufacturing cost because it requires extra programming fixtures. The device, once pre-programmed and mounted, can't be used for other applications as the design evolves. This option also doesn't allow for last-minute changes, enhancements, or bug fixes that may be necessary after the part is mounted on the PCB.

The third option uses ISP with a Joint Test Action Group (JTAG) boundary scan chain to control the pins connected to the flash memory device. This option is in use today because many flash memories don't support the JTAG interface due to cost and space limitations. In this approach, the flash memory device is connected to a JTAG-compliant device on the PCB, which acts as a programming host, as shown in Figure 1.

Other chips, such as an application-specific integrated circuit (ASIC) or a programmable logic device (PLD), may be used as the programming host. This method is inefficient, however, because it requires shifting hundreds of bits of data through the entire JTAG boundary scan chain in order to write just a few bits of data to the flash memory device. Another limitation to this approach, when using a PLD as the host, is that it requires the PLD to enter a programming mode. This means the core of the PLD, and probably other devices connected to it, will temporarily cease to function.

A faster technique
A parallel flash loader (PFL) combined with a complex programmable-logic device (CPLD) offers several benefits over existing flash-programming options. This fourth option takes advantage of real-time ISP, allowing for last-minute design changes, enhancements, or bug fixes without sacrificing development time. The PFL-with-CPLD approach also offers the flexibility to make updates in the field without powering down the entire system.

The PFL method provides a straightforward, cost-effective way to program flash memory devices through the JTAG interface. The PFL uses connections and equipment that are already present and common in the manufacturing process. The programming port is the JTAG test access port (TAP), the same port used for JTAG testing and CPLD programming. The JTAG TAP is found on most PCBs because it requires only four pins to access all of the JTAG-compliant devices on the PCB.

This approach uses a CPLD as a bridge between the JTAG interface and the flash memory device's parallel address/data interface. Instead of shifting data through all of the pins of a JTAG-compliant device, this method quickly retrieves data from the JTAG scan chain and generates data that is formatted for the target flash memory device. Unlike the JTAG boundary scan chain method, the PFL brings the data through the logic array of the CPLD. The PFL relies on a unique connection between the JTAG TAP state machine and the CPLD logic, as shown in Figure 2.

The JTAG port consists of the signals TCK, TMS, TDI, and TDO; TCK and TMS step the JTAG state machine, while TDI and TDO carry serial data in and out. A portion of the logic and routing in the CPLD is used to complete the interface from the JTAG port to the flash memory device. This logic is controlled by the signals from the JTAG state machine and drives the I/O pins connected to the external flash memory device. During this operation, the JTAG state machine decodes instructions in the logic array to transfer data from the JTAG TDI input port to the flash memory device connected to the CPLD. Because the PFL is implemented in a programmable device, it can be adapted to the different memory standards of different vendors. Once the flash memory has been programmed, the PFL logic can be overwritten with the final end application, or it can remain a permanent part of the CPLD application because it consumes only a small portion of logic.
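The serial-to-parallel conversion at the heart of this bridge can be modeled in software. The sketch below is a simplification under stated assumptions: it ignores TMS and the TAP states entirely, and it assumes a hypothetical frame of 20 address bits followed by 16 data bits, sent most significant bit first. Real PFL cores define their own framing, and `BridgeModel` and `frame_bits` are illustrative names only.

```python
ADDR_BITS = 20  # assumed flash address width
DATA_BITS = 16  # assumed flash data width


class BridgeModel:
    """Collects bits arriving on TDI and emits parallel flash writes."""

    def __init__(self):
        self.shift = 0          # shift register contents
        self.count = 0          # bits collected in the current frame
        self.flash_writes = []  # (address, data) pairs driven to the flash

    def tck(self, tdi: int) -> None:
        """Shift in one TDI bit, as would happen on each TCK rising edge."""
        self.shift = (self.shift << 1) | (tdi & 1)
        self.count += 1
        if self.count == ADDR_BITS + DATA_BITS:
            # A full frame has arrived: split it and drive the parallel bus.
            addr = self.shift >> DATA_BITS
            data = self.shift & ((1 << DATA_BITS) - 1)
            self.flash_writes.append((addr, data))
            self.shift = 0
            self.count = 0


def frame_bits(addr: int, data: int) -> list:
    """Serialize one assumed frame, most significant bit first."""
    word = (addr << DATA_BITS) | data
    total = ADDR_BITS + DATA_BITS
    return [(word >> i) & 1 for i in range(total - 1, -1, -1)]


bridge = BridgeModel()
for bit in frame_bits(0x555, 0x00AA):
    bridge.tck(bit)
```

After the loop, `bridge.flash_writes` holds a single parallel write to address 0x555 with data 0x00AA, produced from 36 serial bits rather than from a pass through a complete boundary scan register.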

This method significantly reduces the flash memory device programming time. Using the example of programming a single vector into a 48-pin common flash interface (CFI) flash device, Table 1 shows the amount of time savings possible using the PFL solution. The example compares the PFL method to the traditional method of programming via the JTAG boundary scan chain using a JTAG-compatible PLD or ASIC that has about 200 pins.
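A rough back-of-the-envelope model shows where the savings come from. Every number below is an assumption chosen for illustration and is not taken from Table 1: one boundary scan cell per pin of the 200-pin host, two full scans per flash bus write, four bus writes per programmed word, and a PFL path that shifts only 20 address bits and 16 data bits per bus write.

```python
def boundary_scan_tck(chain_cells: int, scans_per_bus_write: int, bus_writes: int) -> int:
    """TCK cycles when every bus write needs full passes through the scan chain."""
    return chain_cells * scans_per_bus_write * bus_writes


def pfl_tck(addr_bits: int, data_bits: int, bus_writes: int) -> int:
    """TCK cycles when only address and data bits are shifted per bus write."""
    return (addr_bits + data_bits) * bus_writes


# All values below are assumptions chosen for illustration.
CHAIN_CELLS = 200        # one boundary scan cell per pin of the host device
SCANS_PER_BUS_WRITE = 2  # one scan to assert the write strobe, one to release it
BUS_WRITES = 4           # bus cycles needed to program one word
ADDR_BITS, DATA_BITS = 20, 16

bscan = boundary_scan_tck(CHAIN_CELLS, SCANS_PER_BUS_WRITE, BUS_WRITES)
pfl = pfl_tck(ADDR_BITS, DATA_BITS, BUS_WRITES)
print(f"boundary scan: {bscan} TCK cycles, PFL model: {pfl} TCK cycles")
print(f"ratio: {bscan / pfl:.1f}x")
```

With these assumed values the boundary scan path needs 1,600 TCK cycles per programmed word against 144 for the modeled PFL path. Hosts often have more than one scan cell per pin, which would widen the gap.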

In addition to its drastically shorter programming times, the PFL can send configuration or initialization data to the flash memory device for other FPGAs, ASSPs, microprocessors, or ASICs in the system. The remaining logic in the CPLD could then be used to implement the functions that drive configuration signals to these devices.

Existing implementations
Off-the-shelf intellectual property (IP) cores are pre-engineered blocks of logic created for specific functions. Existing IP is commonly used in PLD designs because it doesn't require any design or verification effort. Boundary scan IP companies and PLD vendors have developed various IP cores to implement the PFL solution [2].

In one prepackaged solution, the design software contains the programming specifications for various CFI flash memory devices. Data and control signals for the flash memory device are sent to the PLD via the boundary scan chain. The PFL logic interprets the data and control signals and drives them to the flash memory device. The PFL data and control signals are shown in Figure 3. In order to send data to the flash memory device the software will convert a hexadecimal file (.hex) into a programming object file (.pof).
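The input side of that conversion can be illustrated with a short parser. This sketch assumes the .hex file is in the widely used Intel HEX format, which the article does not state explicitly. The .pof format is vendor-specific, so the output side is not shown. `parse_hex_record` is a hypothetical helper, not part of any vendor tool.

```python
def parse_hex_record(line: str):
    """Parse one Intel HEX record and return (address, record_type, data)."""
    line = line.strip()
    if not line.startswith(":"):
        raise ValueError("record must start with ':'")
    raw = bytes.fromhex(line[1:])
    if len(raw) < 5:
        raise ValueError("record too short")
    # The checksum byte makes the sum of all record bytes equal 0 modulo 256.
    if (sum(raw) & 0xFF) != 0:
        raise ValueError("checksum mismatch")
    length = raw[0]
    address = int.from_bytes(raw[1:3], "big")
    record_type = raw[3]
    data = raw[4:-1]
    if len(data) != length:
        raise ValueError("length field does not match data")
    return address, record_type, data


# A 16-byte data record (type 0x00) followed by an end-of-file record (type 0x01).
address, record_type, data = parse_hex_record(
    ":10010000214601360121470136007EFE09D2190140"
)
eof = parse_hex_record(":00000001FF")
```

Each data record yields a load address and a run of bytes, which is the raw material that programming software needs before it can package the image for the flash device.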

A major advantage of the PFL is that it's implemented in programmable logic, thereby allowing design reuse. For instance, when a designer moves to another programmable logic device, he can reuse the PFL in the new device. He does not need to create a new design to implement the same functionality. As data requirements and flash memory capacities both grow, the PFL approach can still be used with little to no redesign effort. The PFL can easily be ported into new designs or the same design for different platforms.

The PFL method doesn't require special programming fixtures since the CPLD uses the JTAG scan chain connections already present on the PCB. This ability to use existing connections can reduce both manufacturing costs and time. Since the PFL uses only a small portion of logic in the CPLD, the remaining logic can be used for other applications, such as I/O expansion, system configuration, or power-up sequencing. Finally, the PFL function can fit into a small CPLD, resulting in a low component cost.

The upshot
Although the ISP with a JTAG boundary scan chain is often the method of choice for programming today's flash memory devices, this approach has limitations. It requires considerable data shifting through the entire JTAG boundary scan chain to write just a few bits of data. Engineers might consider using the PFL method to program flash memory devices through the JTAG interface because it offers many benefits, including shorter flash memory device programming times, ease of use, and low adoption cost.

Theresa Vu is a senior product marketing engineer for Altera Corp. She spent the last six years in the programmable logic industry. She can be reached at .

1. For a short description of these requirements, see my article “How to integrate flash device programming and reduce costs,” Programmable Logic DesignLine, August 2005,

2. Altera offers the Parallel Flash Loader (PFL) Megafunction in its Quartus II software. Intellitech offers the Fast Access Controller (FAC) intellectual property (IP).
