Improving reliability of non-volatile memory systems

November 19, 2018

Daisuke Nakata, Cypress Semiconductor

Figure 3 shows the flow of operation in a system employing internal wear leveling. As described earlier, wear leveling is triggered by a sector erase operation. For the vast majority of sector erases, the program/erase count of the sector being erased is below the threshold and no sector swap is initiated; only the standard erase procedure, which ends at Step 4, is performed. In the rare cases where a swap is required, the swap procedure is invoked.

Figure 3: The flow of operation in a system employing internal wear leveling, in this case, the EnduraFlex architecture implemented in Cypress’ Semper NOR Flash. (Source: Cypress Semiconductor)

Note that the mapping table is a logical-to-physical sector address map. The Validation Bit is actually a set of three non-volatile flags that record whether each non-volatile operation has completed successfully. The Data Valid Bit is a mask bit that masks invalid data in a sector being swapped to all 0s.

Step 1: Logical Sector A is mapped to Physical Sector X, Logical Sector B is mapped to Physical Sector Y, and so on.

Step 2: The user sends an erase command to Logical Sector A.

Step 3: Erase Physical Sector X.

Step 4: Check whether Physical Sector X has reached the threshold, that is, whether a swap is required.

If no swap is required, the process ends here.

Step 5: Find the swap candidate sector, which is the sector with the fewest erase cycles (in this case, Physical Sector Y).

Step 6: Program Mapping Table in Flash. Note that the Mapping Table in RAM has not been updated yet.

Step 7: Program Validation Bit 1.

Step 8: Copy the data in Physical Sector Y to Physical Sector X, which has already been erased.

Step 9: Program Validation Bit 2.

Step 10: Update Mapping Table in RAM.

Step 11: Erase Physical Sector Y.

Step 12: Program Validation Bit 3.

Step 13: Erase is complete.

The user's erase of Logical Sector A is now complete. Logical Sector A is mapped to Physical Sector Y (blank), which has fewer program/erase cycles than Physical Sector X. Logical Sector B is now mapped to Physical Sector X and stores the original data (i.e., Logical Sector B contains the same data as before the swap). Because this sequence repeats on every erase cycle that triggers a sector swap, all sectors in the wear leveling pool will have a uniform erase cycling history throughout their life cycle.
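The thirteen steps above can be sketched as a small Python model. This is only an illustration, not the EnduraFlex implementation: the names `Pool`, `erase_logical`, and `flash_table` are hypothetical, the threshold value is made up, and the swap criterion (the erased sector leads the least-worn sector by `SWAP_THRESHOLD` cycles) is an assumption, since the article does not define the threshold. The sketch also assumes every physical sector in the pool is mapped to some logical sector.

```python
from dataclasses import dataclass, field

SWAP_THRESHOLD = 3  # hypothetical; the real threshold is not given in the article


@dataclass
class Pool:
    mapping: dict      # RAM mapping table: logical sector -> physical sector
    data: dict         # physical sector -> contents (None means blank)
    erase_count: dict  # physical sector -> program/erase cycles so far
    flash_table: list = field(default_factory=list)  # stand-in for the table in flash


def erase_logical(pool, logical_a):
    """Erase one logical sector; return True if a sector swap was performed."""
    x = pool.mapping[logical_a]            # Steps 1-2: resolve A -> X
    pool.data[x] = None                    # Step 3: erase Physical Sector X
    pool.erase_count[x] += 1

    # Step 4 (assumed criterion): swap once X leads the least-worn sector
    # by SWAP_THRESHOLD cycles. Otherwise the standard erase ends here.
    if pool.erase_count[x] - min(pool.erase_count.values()) < SWAP_THRESHOLD:
        return False

    # Step 5: the swap candidate is the sector with the fewest erase cycles.
    y = min(pool.erase_count, key=pool.erase_count.get)
    logical_b = next(l for l, p in pool.mapping.items() if p == y)

    # Step 6: record the new mapping in flash; the RAM table is still unchanged.
    entry = {"map": {logical_a: y, logical_b: x}, "validation": [False] * 3}
    pool.flash_table.append(entry)
    entry["validation"][0] = True          # Step 7: Validation Bit 1

    pool.data[x] = pool.data[y]            # Step 8: copy Y into the erased X
    entry["validation"][1] = True          # Step 9: Validation Bit 2

    pool.mapping.update(entry["map"])      # Step 10: update the RAM table

    pool.data[y] = None                    # Step 11: erase Physical Sector Y
    pool.erase_count[y] += 1
    entry["validation"][2] = True          # Step 12: Validation Bit 3
    return True                            # Step 13: erase is complete


pool = Pool(mapping={"A": "X", "B": "Y"},
            data={"X": "a-data", "Y": "b-data"},
            erase_count={"X": 0, "Y": 0})
results = [erase_logical(pool, "A") for _ in range(3)]
print(results)       # [False, False, True]
print(pool.mapping)  # {'A': 'Y', 'B': 'X'}
```

In this toy run the third erase of Logical Sector A triggers the swap: A ends up on the blank, less-worn Sector Y, and Logical Sector B keeps its data, now stored in Sector X.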

Figure 4 shows a simulation where ~1.3M program/erase cycles are applied to a logical sector (128). The internal wear leveling function spreads the 1.3M cycles across 256 sectors, resulting in an average cycle count of 5089 per sector.

Figure 4: Simulation results of ~1.3M program/erase cycles spread across 256 sectors for an average cycle count of 5089 per sector. (Source: Cypress Semiconductor)
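As a quick consistency check on the numbers quoted for Figure 4, the arithmetic can be reproduced in a few lines. The variable names are illustrative; only the two input values come from the article.

```python
SECTORS_IN_POOL = 256         # from Figure 4
AVG_CYCLES_PER_SECTOR = 5089  # from Figure 4

# Total physical program/erase cycles implied by the average.
total_physical_cycles = SECTORS_IN_POOL * AVG_CYCLES_PER_SECTOR
print(total_physical_cycles)  # 1302784, i.e. ~1.3M

# Without wear leveling, the single physical sector behind logical
# sector 128 would have absorbed all of these cycles by itself.
reduction = total_physical_cycles / AVG_CYCLES_PER_SECTOR
print(reduction)              # 256.0
```

In other words, the most heavily used sector sees roughly 256 times fewer cycles than it would without wear leveling, which is simply the size of the pool.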

Note that BER, data retention, and endurance are strongly related. The BER rises exponentially with the number of program/erase cycles. Thus, if the number of program/erase cycles is reduced by a factor of ten, the BER improves by several orders of magnitude. While a complete reliability analysis is beyond the scope of this article, it is apparent that wear leveling has a significant positive effect on NVM reliability. Internal wear leveling, where wear leveling is integrated into the memory device, makes this level of improved reliability available to any host system in an entirely transparent manner.

Power Failures

One of the most critical concerns and technical challenges when implementing reliable storage is robustness against power failures. Specifically, the NVM device relies on the mapping table being correct. This raises the question of how to handle a power failure that occurs while mapping information is being updated, since an error could compromise the entire NVM device.

A power failure during a normal erase operation leaves the sector to be erased as “incompletely erased”. Flags associated with the erase operation indicate this state and prompt a re-erase after the next power-on cycle.
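The flag-and-re-erase behavior for a normal erase can be illustrated with a toy model. Everything here is an assumption made for illustration: the `Sector` class, the single `erase_in_progress` flag, and the two-half erase are hypothetical and do not describe how the device stores or checks its flags.

```python
ERASED = 0xFF  # an erased flash byte reads as all 1s


class PowerLoss(Exception):
    """Simulated loss of power in the middle of an operation."""


class Sector:
    def __init__(self, contents: bytes):
        self.cells = bytearray(contents)
        self.erase_in_progress = False  # hypothetical non-volatile flag


def erase(sector, fail_midway=False):
    """Erase a sector in two halves so a failure can be injected in between."""
    sector.erase_in_progress = True     # set the flag before touching the cells
    half = len(sector.cells) // 2
    sector.cells[:half] = bytes([ERASED]) * half
    if fail_midway:
        raise PowerLoss                 # flag stays set: "incompletely erased"
    sector.cells[half:] = bytes([ERASED]) * (len(sector.cells) - half)
    sector.erase_in_progress = False    # clear the flag only when fully erased


def power_up(sectors):
    """Re-erase every sector whose flag shows an interrupted erase."""
    interrupted = [s for s in sectors if s.erase_in_progress]
    for s in interrupted:
        erase(s)
    return len(interrupted)


s = Sector(b"\x12\x34\x56\x78")
try:
    erase(s, fail_midway=True)
except PowerLoss:
    pass
half_erased = bytes(s.cells)            # first half erased, second half stale
re_erased = power_up([s])               # the flag prompts a re-erase
```

The point of the sketch is the ordering: the flag is set before the erase begins and cleared only after it finishes, so any interruption in between is detectable at the next power-up.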

A power failure during wear leveling is more complicated, since the physical erase operation is no longer limited to the sector the user attempted to erase. It now involves an additional erase operation triggered by the wear leveling algorithm; see Figure 3, where Physical Sector Y is erased by the wear leveling algorithm and this erase is not visible to the application. Thus, a power failure recovery routine needs to be part of the wear leveling algorithm.

Assume in Figure 3 that power is lost at Step 11. When power is restored, the device first reads the Mapping Table in the wear leveling address space and checks its validity in order to reconstruct the Mapping Table in RAM. Each entry of the Mapping Table in Flash contains the Validation Bits (1, 2, 3). If power is lost at Step 11, the Validation Bits of the interrupted swap may read (0,0,1). The Data Valid Bit of the sector being erased (Sector Y) is set to “invalid”, and the swap {A=Y, B=X} is recorded in the Mapping Table in RAM. Logical Sector A is now mapped to Physical Sector Y, but the erase of Physical Sector Y may still be incomplete. That is not a concern, because the user's erase of Logical Sector A was interrupted and is recorded as interrupted; after the next power-up, Sector A will need to be re-erased. Logical Sector B is mapped to Physical Sector X and holds the data it had before wear leveling was initiated.
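The recovery case described above can be sketched as follows. This is a minimal model under stated assumptions, not the device's actual routine: the function name `rebuild_ram_table` and the entry layout are hypothetical, and the sketch reads the article's (0,0,1) as "Validation Bits 1 and 2 programmed, Bit 3 not yet programmed", which matches the usual flash convention that programming clears a bit to 0. Interruptions earlier than Step 9 are not described in the article, so the sketch leaves the mapping untouched for them.

```python
PROGRAMMED, UNPROGRAMMED = 0, 1  # assumed reading of the article's (0,0,1)


def rebuild_ram_table(base_map, flash_entries):
    """Rebuild the RAM mapping table from swap entries logged in flash.

    base_map: logical -> physical mapping before the logged swaps.
    flash_entries: swap records in the order they were programmed.
    Returns the RAM table and the set of physical sectors whose
    Data Valid Bit should be treated as "invalid".
    """
    ram_map = dict(base_map)
    data_invalid = set()
    for entry in flash_entries:
        bit1, bit2, bit3 = entry["validation"]
        if (bit1, bit2) != (PROGRAMMED, PROGRAMMED):
            # Not covered by the article; this sketch keeps the old mapping.
            continue
        a, b = entry["logical_a"], entry["logical_b"]
        x, y = ram_map[a], ram_map[b]
        ram_map[a], ram_map[b] = y, x    # record the swap {A=Y, B=X}
        if bit3 == UNPROGRAMMED:
            data_invalid.add(y)          # erase of Y was interrupted (Step 11)
    return ram_map, data_invalid


interrupted_swap = {"logical_a": "A", "logical_b": "B", "validation": (0, 0, 1)}
ram_map, data_invalid = rebuild_ram_table({"A": "X", "B": "Y"}, [interrupted_swap])
print(ram_map)       # {'A': 'Y', 'B': 'X'}
print(data_invalid)  # {'Y'}
```

Sector Y ends up flagged as invalid and mapped to Logical Sector A, the sector the user was erasing anyway, while Logical Sector B points at Sector X, which received its data at Step 8.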

Figure 5: Power-up sequence that must be added to the wear leveling algorithm to recover from a power failure during an erase cycle. (Source: Cypress Semiconductor)

The above power-up sequence ensures that the wear leveling algorithm does not corrupt any user data. Nor does it require the application to implement any special software or hardware algorithm. Because wear leveling is implemented internally, the entire process is transparent to the application. This makes wear leveling extremely useful for high-reliability applications, such as automotive systems, where meeting reliability targets can be challenging.

Next-generation NVM with integrated wear leveling is crucial for industries that demand high reliability. For example, Cypress Semper NOR Flash Memory combines advanced NVM technology with wear leveling and ECC to achieve an endurance of more than 1 million program/erase cycles and 25 years of data retention.

Daisuke Nakata is Director of Systems Engineering for the Memory Products Division at Cypress Semiconductor. He works on systems architecture development to enable the next generation of memory products. He holds a master’s degree in materials processing from Tohoku University, Japan.
