Take advantage of the non-volatile memory services in flash-based MCUs - Embedded.com

Take advantage of the non-volatile memory services in flash-based MCUs

Most applications that require a microcontroller also require somemechanism to store settings that will be remembered even when power isremoved. For example, a radio that forgets its station presets when thebattery is changed isn't going to be particularly successful in themarket. Customers expect that favorite stations, temperature presets,preferences and other persistent information will be saved from one useto the next.

To meet this customer expectation, designers traditionally use aserial EEPROM. These devicesare small, inexpensive and have a long history that makes the designengineer comfortable with their use. But in today's cost-sensitivemarketplace, even an inexpensive device may break the budget. It istempting, therefore, to seek a resource already included on themicrocontroller chip itself: leftover space in the program flash.

In the past, many microcontrollers would use ROM orultraviolet-erasable EPROM to store program instructions. Butincreasingly, microcontrollers turn to flash technologies for codestorage. The main reason for choosing flash memory is that it can beerased and updated if errors are found in the program code.

Most microcontrollers provide a mechanism to retrieve data stored inprogram space. Processors based on the VonNeumann architecture, such as the TI MSP430, can use any addressingmode to read program flash. Harvardarchitecture processors generally provide special mechanisms totransfer data from program space to data space. Other MCUs with Flash management capabilities include:

1) the popular 8051processor family which includes the MOVEC (move constant) instruction,

2) Microchip's PIC18 familywhich includes the TBLRD and TBLWR (table read and table write)instructions, and

3) the Maxim MAXQmicrocontroller family which features a pseudo-Von Neumann architectureto permit access to flash program store with a simple MOVE instruction(Figure 1, below).

Figure1. A traditional Harvard machine has separate code and data pipes sothat operands and instructions can be fetched in parallel, but requiresspecial instructions to use information stored in code space as data.In a pseudo Von Neumann machine such as the Maxim MAXQ2000, a dataswitch allows any memory block to be used as either code or data whilemaintaining separate code and data pipes.

But even with the ability to read data from random flash locations,a complete non-volatile memory subsystem must also be able to randomlymodify data in flash. And that means resolving two problems: first,once a flash location is written, data in that location can only bechanged by erasing the entire flash block, often 128 bytes or more; andsecond, flash memory can endure only a limited number of write/erasecycles.

This article will demonstrate how to construct a mechanism thatsolves these problems and emulates random writes using a block of flashmemory. The examples presented are for a MAX2000, but the principlesare applicable to any processor that supports the ability of user codeto read, write and erase flash memory.

The schemes described here are used in a three-phase electricitymeter reference design based around the MAXQ3180 three-phase analogfront end and the MAXQ2000.

The basics of flash management
Flash memory isa type of electrically-erasable memory. It is generally considered tobe “read-mostly;” that is, while writable, it is expected that dataupdates will be infrequent, and that the bulk of operations to thememory will be read operations. Most flash memory devices are writableat the word level, but can only be erased an entire block at a time.This makes them generally unsuitable for variable storage, and suitableonly for constant data tables that never change.

There are two kinds of flash memory: NANDflash and NOR flash.NAND flash is the variety found in memory cards and flash drives. Ingeneral, reading from a NAND device requires several cycles, and mostoften is accomplished in a serial manner.

This makes it unsuitable for storage of program code, since theaccess time would be much too long. NOR flash, on the other hand,resembles conventional byte- or word-wide memory. NOR flash can be readas one would read a ROM device: assert a device select and an address,and wait the access time before reading the data from the bus.

Flash memory blocks generally erase to a “1” state; consequently,after erasing, every location in the block will contain 0xFFFF.”Programming” a flash location changes some bits from a “1” state to a”0″ state. To return a programmed bit back to a “1” state, the entireblock must be erased.

An issue that must be confronted with any electrically-erasablememory device is that of endurance. Depending on the specifics of thetechnology, a flash memory cell can endure as few as 1,000 to as manyas 1,000,000 erase-program cycles before permanently failing.

Any scheme that uses flash memory for data storage must ensure thatwrite cycles are evenly distributed about the array, and that no oneset of locations is erased and programmed more often than another.

Most flash devices will permit any unprogrammed bit in a previously-programmed location to be changed from a “1” to a “0” state. Forexample, most devices would allow a location programmed with the value0xFFFE to be subsequently programmed with the value 0x7FFE, since itdoes not change any bit value from “0” to “1”. The flash used in theprocessor architecture shown in Figure 1, however, does not permitthis. Such a write attempt would fail, leaving the value at 0xFFFE.

The reason for this is a good one: since the memory block that isbeing programmed is primarily used as code space, it is prudent toforbid any write operation to a location that has been previouslywritten. Since the instruction 0xFFFF specifies an invalid sourcesub-decode, it is unlikely to appear in a valid code block. Thus,blocking writes to a previously programmed location helps preserve theintegrity of the code blocks.

Providing non-volatile memoryservices
Following are two schemes for providing non-volatile memory services.The first scheme emphasizes simplicity, while the second scheme is moreflexible at the expense of somewhat higher complexity.

Scheme 1
Problem: Configuration data, such as calibration information, MAC addresses ormanufacturing data must be stored in a product. While this is generallyfixed information, the possibility exists that the configuration datamay need to be updated several times over the lifetime of the product.

Solution: Thisis actually the simplest conceivable case. Two blocks – one at wordaddress 0x7E00 and the other at 0x7F00 – are reserved for data storage.The first time that a command is received to save the configurationdata, the processor checks both blocks and discovers that they are bothblank. The configuration data is saved to the first block.

The second command to save configuration data causes the processorto once again check both blocks. When it finds block 0 in use, itcopies the configuration data to block 1, and then erases block 0.

When a request is received to restore configuration (on power-up,for example) the processsor reads both blocks to determine which is inuse. Whichever block is not erased is used.

The main advantage of this scheme is its simplicity: if theapplication requires a block configuration at power-up (or otherconfiguration-restore events) this works well. The read routine acceptsa word pointer and returns the value at that address, and the writeroutine accepts a word pointer and attempts a write cycle at thataddress. The erase routine just erases both blocks.

The primary disadvantage of this scheme is the reverse side of itsprimary advantage: the routines are extremely simple-minded. No attemptis made to determine if a write will succeed — the write is attempted,and if it fails, nothing is done to try to fix the problem. That is whythe scheme is only used for writes to what is known to be a blank flashblock.

Scheme 2
Problem: Non-volatilestorage is required to keep track of energy usage and other oftenvariable data. Updates occur from several times a week to several timesa day.

Solution: Thisis a situation in which even conventional EEPROM would need help. Theproblem is this: the frequency of updates and the fact that allnon-volatile memory have limited write lifetimes preclude writing anderasing a single EEPROM cell over and over. Consider the case of anupdate once per hour. An EEPROM with a 10,000 write-erase cycle limitwould be exhausted in just over a year — well short of the ten-yeardesign goal established for the electricity meter.

A solution to this problem is to implement some form of “wearleveling”. This means that no single location is written over and over.Instead, write cycles are spread over the entire memory array withappropriate indexing similarly distributed.

Wear leveling is a well-understood technology and is used in flashstorage devices for just this purpose. The algorithms can be complexand difficult to understand. But for our purposes, a much simplermechanism will suffice.

Instead of being referenced by address, data items in the storagearray are referenced by data element number.

A data element number is an arbitrary eight-bit value that uniquelyidentifies the data element; thus, in this scheme, there can be amaximum of 255 data elements (data element 0 is reserved.)

Each data element has a two-byte header (see Figure 2 below ) that contains thedata element number and the data element length — a two-bit code thatindicates one, two, three or four sixteen-bit words — with sufficientspace left over for error management, if desired.

Figure2: Structure of a Data Element Header

Writing a data element requires the address of the data to bewritten, the data element number to be written, and the length. Thewrite function seeks the end of the array, and then writes the new dataelement immediately following the last record.

If there is insufficient space in the flash page to contain a recordof the specified length, an end of page marker is written and a newpage is opened. See Figure 3 below for the structure of a typical data page.

In the illustrated data page, data element 1 was written first, andis frequently updated. Data element 4 was written next, and has neverbeen updated. Data element 3 was then written, and has been updatedseven times. Finally, data element 2 has been written and neverupdated.

The presence of a page end marker indicates that a data writeattempt was made, but the data element was too long to fit on the page.A new page would have been opened to accommodate the data element. Theend of the entire data structure is designated by a blank element wherean element header is expected to be.

Figure3: A typical data page

Note that I have not addressed the issue of duplicate records.That's because, in this scheme, duplicate records are not an issue. Infact, duplicate records are completely ignored by both the read andwrite routines.

When writing, the new record goes at the end of the array withoutregard to whether or not a same-numbered record already exists. Whenreading, only the last — and thus, most recent — record that matchesthe requested record number is retrieved.

Reading a data element from the array is a little more involved thanwriting. The read function accepts an element number and an address towhich the contents of the data element should be written. When called,the read function searches the array from the beginning.

When it finds a record that matches the requested data element, itstores away the address and keeps searching. If it finds anothermatching record, it replaces the stored address with the new address.

By the time the end of the array is reached, the stored address willpoint to the most recently written copy of the requested record. Theread function then copies this data to the buffer passed when thefunction was called.

Reusing memory space
Now, we have a workable read-mostly mechanism to store and retrieverecords from a storage array. There's just one problem: we haven'tbuilt any mechanism to reuse space occupied by obsolete copies of therecords. (We also haven't built a mechanism to delete a record, butsince this is structured to work in an embedded application that'sprobably not a critical feature.)

Without recovering space, the allocated space will quickly beexhausted. Recovering space means erasing entire pages, since flashmemory can only be erased one full page at a time. But flash pagescan't be arbitrarily erased without running the risk of deleting usefulinformation. The only option is to copy active information to a newpage before erasing old pages.

Recovering space from obsolete records is a three-stage process.First, new flash pages are opened and the most recent versions of eachdata element are copied into the new pages. Second, the old pages areerased. Finally, page markers are placed on the new pages so that theread routine can find them.

The first stage is the tricky part, so we'll look at that in moredetail. A simple way to perform this step would be to divide it intotwo substeps: first, use a RAM array to store the record number and theaddress of the most recent record in the array; and second, stepthrough the RAM array to copy the most recent records to the new flashpages. This would be fast and relatively painless.

The problem with that scheme is that the processor used (see Figure1) has 1K word of RAM. Such a scheme would limit the amount of uniquedata that could be stored to the size of the RAM that could be sparedfor the buffer. This is clearly unacceptable.

The solution is time-consuming but works no matter how large (withinreason) the storage array becomes. Instead of building a pointer listin RAM, make multiple passes through the source array for every uniqueentry. Thus, the compacting algorithm becomes:

1) Read an element from thesource array.
2) Search for the element inthe destination array. If found, that element has already been written.Increment the source pointer and go back to (1).
3) Scan the source array forthe latest copy of the element.
4) Write latest copy of thedata element to the destination array.
5) Increment the sourcepointer and go back to (1).

At the end, there will be exactly one entry for each unique dataelement in the destination array. An illustration of a packed page isgiven in Figure 4 below. The source pages can now be safely erased, andpage headers written to the destination array.

Figure4: The data page of Fig. 2 following space recovery

The process has been structured to be as safe for the stored data aspossible. A danger that must be confronted when working with flashdevices is that of power failure during a write or erase operation.

If power fails, it is likely that one or more pages will becorrupted (in the case of a write) or incompletely erased (in the caseof an erase operation). But the compact operation is inherently safe.Consider:

* If a power failure occurs during the packing operation, the sourcepages are completely intact. After power restoration, the newly writtenpages can be easily identified (they have no page headers) and erased,and the pack operation restarted.

* If a power failure occurs while the old pages are being erased,they will probably contain invalid headers. These pages can be erasedand headers added to the new pages.

*If a power failure occurs when page headers are being written tothe new pages, the data is intact. The page header update operation canbe restarted.

In short, there should be no mechanism through which unexpectedevents can irrecoverably corrupt array data.

Enhancements
As presented, the storage subsystem has no error detection mechanism.There are bits available — six, as presented — in the data elementidentifier that are uncommitted.

One could use a CRC6 algorithm (x6 + x + 1) to compute aCRC across the data element to ensure no read or write errors haveoccurred. While this is not a particularly robust algorithm (it willmiss one out of 64 multi-bit errors) it will detect most real-worlderrors likely to occur.

Another limitation to the system as given is that read access timeis necessarily long. For every read, every record in the array must beread to find the most recent record. There are three methods that couldbe used to improve access time:

1) Leave a blank word inthe data element for a forward pointer. When the value is updated, fillin the forward pointer to the new entry. In this way, the table can betraversed as a linked list.

2) Traverse the tablebackward. Now, you can simply stop at the first occurrence of therequested element.

3) If there is only a smallnumber of unique elements, create a RAM array at startup that containselement IDs and pointers. Subsequent accesses are very fast — justread the RAM array to find out where to obtain the data element.

Ben Smith is an engineering manager for Maxim Integrated Products. Overhis thirty years in the industry, Ben has developed products for theindustrial, communications and consumer markets. His team currentlydevelops applications for Maxim's MAXQ line of microcontroller products.

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.