Smaller, faster, cheaper. Repeat. This has been the mantra of the semiconductor industry for more than 50 years, ever since Douglas Engelbart, a computer engineer, introduced the elegant but radical idea of “scaling” to the electronics industry. Speaking at the 1960 International Solid-State Circuits Conference, Engelbart made a deceptively simple observation: as you make electronic circuitry smaller, components get faster, less expensive and less power hungry.
The corollary to scaling is integration. As individual transistors and other components get smaller, more of them can be combined – integrated – on a single chip. From discrete logic functions to microprocessors to microcontrollers to systems on chip, the industry has progressed to higher and higher degrees of integration as it has learned to make transistors and other components smaller and smaller.
But not all things have been able to shrink at the same rate. Most notably, memory technologies have not scaled as fast as logic circuitry. The result is a divergence between these two critical segments of the semiconductor industry, with different manufacturing processes for each, requiring separate foundries on two distinct technology paths. And because memory and logic are manufactured on different process nodes, they cannot be integrated on the same chip.
Now engineers are looking up, literally, and moving into the third dimension as they seek even greater levels of integration. Moving from 2D planar topologies to 3D implementations offers many desirable benefits, including a smaller overall footprint and shorter average interconnect length, along with associated improvements in cost, latency and energy consumption. But with logic and memory unable to reside on the same chip, engineers have been forced to consider a number of alternative methods for stacking individual chips and connecting them, including through-silicon vias (TSVs) and silicon interposers, to achieve their integration goals. This kind of stacking, however, is plagued by temperature, yield and other issues.
Enter nonvolatile resistive RAM (RRAM) technology. RRAM can be manufactured on the same CMOS process node, in the same manufacturing foundries, used today for logic circuitry. This manufacturing convergence, together with other unique features of RRAM technology, is making 3D monolithic integration of massive amounts of storage-class memory and logic circuitry possible for the first time. By eliminating expensive interposer and other chip-to-chip and stacked-chip integration techniques and their limitations, RRAM technology enables system-on-chip architectures to benefit from the superior characteristics of embedded storage in next-generation computing platforms, creating new markets and new opportunities for monolithic embedded storage.
Since its introduction in 1987 by Dr. Fujio Masuoka from Toshiba, NAND flash technology has achieved the largest share of the nonvolatile memory (NVM) market. Today, NAND flash is the preferred mass storage media across a broad range of consumer applications, including tablets, smart phones, media players, cameras and video recorders, as well as in solid state drives (SSD) to boost the performance of computers, from slim laptops to massive enterprise storage systems.
But this 30-year-old technology has recently begun showing its age. It is now widely accepted that scaling NAND flash below 25nm significantly degrades performance and reliability. For example, scaling from 72nm to 16nm has shown an increase in the raw bit error rate (BER) from 1e-7 to 1e-2 and a drop in endurance from 10,000 program/erase cycles to below 3,000. This, in turn, demands increased overhead and computational power from the NAND controller logic and the host system to compensate.
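To make that ECC burden concrete, the uncorrectable-codeword rate implied by a given raw BER can be estimated with a binomial tail. The sketch below is illustrative only: the 8192-bit codeword size and the two correction strengths (t) are assumptions, and only the two raw BER figures come from the text.

```python
from math import exp, lgamma, log, log1p

def uncorrectable_rate(raw_ber: float, n_bits: int, t: int) -> float:
    """P(more than t bit errors in an n_bits codeword), i.e. the ECC
    failure rate. Computed in log space so large codewords don't overflow."""
    def log_pmf(k: int) -> float:
        return (lgamma(n_bits + 1) - lgamma(k + 1) - lgamma(n_bits - k + 1)
                + k * log(raw_ber) + (n_bits - k) * log1p(-raw_ber))
    return sum(exp(log_pmf(k)) for k in range(t + 1, n_bits + 1))

# 72nm-era raw BER: a modest 4-bit-correcting code already suffices.
old_node = uncorrectable_rate(1e-7, 8192, 4)
# 16nm-era raw BER: correcting even 160 bits per codeword is needed
# to push the failure rate back down to a comparable regime.
new_node = uncorrectable_rate(1e-2, 8192, 160)
```

Under these assumptions the 16nm case needs a code roughly forty times stronger than the 72nm case to stay in the same failure-rate regime — precisely the controller overhead described above.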
NAND flash is hitting this technology wall just as emerging storage applications are demanding higher reliability and endurance. Current 2D planar NAND flash technology has reached a limit and will not be able to serve these applications.
This widening gap between application requirements and flash technology has spurred industry professionals to look into new solutions, including various 3D approaches, to resolve these issues while boosting density. While these efforts are commendable and may achieve some near-term success, NAND flash will always remain a separate component of any system, and designers will still need to contend with all that implies, from bandwidth and latency limitations to cost and power burdens.
The industry needs a fresh approach that not only addresses the scaling and reliability issues confronting NAND flash, but also the larger issues of achieving higher levels of integration between logic and memory. And this will be possible only when these two manufacturing process nodes converge again.
RRAM is widely hailed as the most promising technology in the race to develop new, more scalable, high-capacity, high-performance and reliable storage solutions. RRAM technology is based on a simple two-terminal device structure integrated in a back-end-of-line (BEOL) process. RRAM cells typically employ a switching material sandwiched between two metallic electrodes; the material exhibits different resistance characteristics depending on the voltage applied across it. Significant performance differences can be achieved depending on the switching materials and memory cell organization chosen. When the RRAM technology uses CMOS-friendly materials and standard CMOS manufacturing processes, multiple layers of cross-point RRAM arrays can be integrated on top of CMOS logic wafers to build SoCs and other chips with large amounts of 3D monolithic embedded RRAM storage.
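The two-terminal switching behavior described above can be sketched as a toy state machine. Every voltage and resistance value below is an illustrative assumption, not data for any particular device:

```python
class RRAMCell:
    """Toy model of a two-terminal resistive switch: a voltage at or above
    v_set drives it to a low-resistance state; a voltage at or below
    v_reset returns it to a high-resistance state."""

    def __init__(self, r_on=1e4, r_off=1e7, v_set=2.0, v_reset=-1.5):
        self.r_on, self.r_off = r_on, r_off
        self.v_set, self.v_reset = v_set, v_reset
        self.resistance = r_off                  # start in high-resistance state

    def apply(self, voltage: float) -> None:
        if voltage >= self.v_set:
            self.resistance = self.r_on          # SET (program)
        elif voltage <= self.v_reset:
            self.resistance = self.r_off         # RESET (erase)
        # small voltages in between leave the state untouched

    def read(self, v_read: float = 0.3) -> float:
        """A small read voltage returns the cell current without switching it."""
        return v_read / self.resistance

cell = RRAMCell()
cell.apply(2.5)          # program
i_on = cell.read()       # high current: low-resistance state
cell.apply(-2.0)         # erase
i_off = cell.read()      # low current: high-resistance state
```

Reading is nondestructive here because only voltages beyond the thresholds change the state — the property that lets a passive cross-point array hold data between accesses.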
Regardless of the material specifics, developers of RRAM technology all face several common challenges: overcoming temperature sensitivity, integrating with standard CMOS technology and manufacturing processes, and limiting the effects of sneak-path currents, which would otherwise disrupt the stability of the data contained in each memory cell.
RRAM technology employs filamentary nanoparticles and simple CMOS-compatible materials, such as non-conductive amorphous silicon (a-Si), as the switching material. Because it uses low-temperature BEOL process integration, it can be integrated easily with CMOS logic circuitry and manufactured in existing CMOS fabs without any special equipment or materials. When an electric field is applied across the cell, a metallic filament forms and changes the cell's resistive characteristics. Because the switching mechanism is field-driven, cell behavior is very stable across a wide temperature range.
From a cell perspective, resistive memory cells should maintain the same on-current even as the device area is scaled down, but with reduced off-currents. On/off current ratios from a few hundred to more than a thousand are typical. This also improves the sensing margin, enabling sensing with less complicated CMOS peripheral circuitry and making multi-level and triple-level cells (MLC/TLC) feasible at smaller technology nodes.
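The sensing-margin point can be illustrated with a quick calculation. Assuming a 10 kΩ–10 MΩ resistance window (a 1000x on/off ratio, consistent with the range above), logarithmically spaced intermediate levels still leave a 10x resistance gap between adjacent MLC states:

```python
r_on, r_off = 1e4, 1e7          # assumed on/off window (ratio 1000)
n_levels = 4                    # MLC: 2 bits per cell
step = (r_off / r_on) ** (1 / (n_levels - 1))       # spacing between levels
levels = [r_on * step**i for i in range(n_levels)]  # 10 kOhm ... 10 MOhm
v_read = 0.3
currents = [v_read / r for r in levels]             # per-level read currents
```

With a full decade of resistance between levels, even a simple current-mode sense amplifier can separate the four states; a smaller on/off ratio would squeeze this margin and demand more complex peripheral circuitry.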
One of the greatest challenges facing developers in achieving ultra-high density RRAM (>1Tb) has been overcoming the leakage (sneak) current problem in cross-point arrays that interferes with the reliable reading of data from individual memory cells. More advanced technologies such as Crossbar’s offer a built-in select feature able to overcome the sneak current issue that has plagued other RRAM approaches. This opens the door to the use of architectures with the ability to employ a single transistor to drive as many as 2000 resistive memory cells with very low power. Crossbar’s field assisted superlinear threshold (FAST) selector device is capable of suppressing the leakage current below 0.1nA, and has been successfully demonstrated in a 4-Mbit integrated 3D stackable passive array.
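A back-of-envelope model shows why the selector matters. The array size and device values below are illustrative assumptions; only the sub-0.1nA leakage figure comes from the text. In the worst case the selected cell is OFF while its neighbors are ON, so parallel sneak paths (three ON cells in series) can dwarf the read current:

```python
n = 64                                   # assumed n x n cross-point array
r_on, r_off, v_read = 1e4, 1e7, 0.3      # assumed device values

i_signal = v_read / r_off                # selected OFF cell: ~30 nA of signal
r_sneak = 3 * r_on                       # one sneak path: 3 ON cells in series
n_paths = (n - 1) ** 2                   # such paths sit in parallel
i_sneak = v_read * n_paths / r_sneak     # crude upper bound (ignores wire R)

# A selector holding every half-selected cell below 0.1 nA instead bounds
# the leakage on the selected row and column to:
i_sneak_with_selector = 2 * (n - 1) * 0.1e-9
```

Without a selector, the aggregate sneak current in this toy model exceeds the signal by orders of magnitude, making the OFF cell unreadable; with per-cell leakage capped, total leakage stays safely below the read signal.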
The high selectivity of the FAST device and its ability to be integrated directly into each RRAM memory cell make it possible to move beyond the density limitations of 1T1R array structures and implement commercial memory products based on 3D stackable 1TnR memory architectures for ultra-high-density nonvolatile memory applications. The typical primary benefits of higher levels of integration are higher performance, lower energy consumption and lower cost.
Compared to NAND flash, RRAM technology delivers 100X lower read latency and 20X faster write performance. Unlike flash, RRAM can be architected with small pages that can be independently erased or re-programmed. This new storage architecture drastically reduces the complexity of the storage controller by removing a large portion of the background memory accesses required for garbage collection. With a write amplification of 1, RRAM delivers tangible benefits in read and write latency, energy consumption and the lifetime of the storage solution.
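The write-amplification contrast can be sketched with a toy steady-state model. The 80% valid-page fraction is an assumption for illustration; the point is structural: NAND cannot overwrite a page in place and must relocate a block's still-valid pages before erasing it, while RRAM's independently re-programmable pages are simply updated in place.

```python
def nand_physical_writes(host_pages: int, valid_fraction: float = 0.8) -> float:
    """Host writes plus the valid pages garbage collection must copy.
    Steady-state write amplification works out to 1 / (1 - valid_fraction)."""
    gc_copies = host_pages * valid_fraction / (1 - valid_fraction)
    return host_pages + gc_copies

def rram_physical_writes(host_pages: int) -> int:
    return host_pages            # in-place update: write amplification = 1

wa_nand = nand_physical_writes(1000) / 1000   # 5.0 under these assumptions
wa_rram = rram_physical_writes(1000) / 1000   # 1.0
```

Every unit of write amplification above 1 costs extra latency, energy and cell wear, which is why a write amplification of 1 shows up in all three benefit columns.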
Higher Performance – On-chip storage eliminates the need for a complex controller to talk to external memory and enables the use of wide memory buses that break the bandwidth bottleneck between computing cores and RRAM storage arrays. Different kinds of storage elements can be integrated for each kind of memory usage, and each processing core can have its own dedicated on-chip memory and storage arrays. By eliminating connectors and pads, external buses, output buffers, re-synchronization and the like, latencies drop to a few hundred nanoseconds. And latencies become more predictable, thanks to the better data management and reduced background operations the simplified controller allows. At the system level, hot data can now be stored very close to the processing cores in on-chip persistent memory, enabling new architectures and data-management tiering.
Lower Power – At the memory cell level, RRAM can improve programming performance and power consumption by achieving a 64pJ/cell program energy, a 20X improvement over NAND flash technology. At the system level, keeping storage memory on-chip cuts energy consumption further by reducing or eliminating accesses to external memory and the I/O operations they require. Lower, more predictable latencies also reduce power consumption by shortening the execution time of code fetching and data streaming.
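As a quick back-of-envelope check of what the cell-level figure means at page granularity: the 64pJ/cell energy and the 20X factor come from the text, while the 4 KiB page size and one-bit-per-cell assumption are mine.

```python
page_bits = 4096 * 8                  # assumed 4 KiB page, one bit per cell
rram_pj_per_cell = 64                 # program energy figure from the text
rram_page_uj = page_bits * rram_pj_per_cell / 1e6    # ~2.1 uJ per page program
nand_page_uj = rram_page_uj * 20                     # ~42 uJ at the stated 20X
```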
Higher Security – With no external bus required for the processor to access storage, systems are less susceptible to direct data interceptions through physical attachment and to a variety of side channel attacks that analyze measurements of electromagnetic emanations. Everything remains on-chip and not visible outside the package.
Lower Cost – Monolithic integration of storage memory offers many cost benefits by eliminating the need for a variety of mechanical connectivity components and methods of varying complexities that can lower yields and increase overall fabrication cost. Multi-chip modules, across a board or substrate or using package-level integration, and wafer-level integration using TSVs and interposers generally require tradeoffs between cost, performance, functionality and footprint.
At $60 billion annually, the memory market is huge, but until now it has existed in a separate domain from most of the rest of the semiconductor industry, and the NAND flash portion of the market has been dominated by about six major vendors and fabs. Entrenched legacy technologies like NAND won’t be displaced overnight. It takes time to build a strong ecosystem of new-generation memory technology providers, partners and strategic alliances to nurture this kind of innovation and bring it to end users profitably. Innovative companies are now building differentiated applications with monolithic embedded RRAM storage, creating new opportunities within an ecosystem of partners. Successful 3D RRAM technologies will have to demonstrate smooth integration in pure-play CMOS foundries, overcome temperature sensitivity and solve the sneak-path current challenge of cross-point arrays.