The Future of Embedded Flash
Lee Cleveland, Vice President, Engineering, Kilopass
Differentiation and time-to-market have always been the
cornerstones of success for chip companies. Embedded Flash
has played an important role in enabling competitive products
by providing flexibility and a higher level of integration to the system-on-chip (SOC). At 90nm and above, embedded Flash is prevalent
in many applications, ranging from code storage in power window
controllers to secure key storage in smartcards. But at 65nm and
below, there is currently no viable embedded Flash solution available.
A solution will be needed as products migrate to more advanced
process geometries in the next few years. This article explores possible non-volatile
memory (NVM) technologies to enable embedded Flash
at 65nm and below, including magnetoresistive
random access memory (MRAM), silicon-oxide-nitride-oxide-silicon
(SONOS), resistive random access memory (RRAM) and antifuse.
Traditional embedded Flash based on floating gate technology
is not economically scalable below 90nm. New array architectures
and process modifications would be required because the thinner
oxides in advanced technologies do not favor data
retention in charge-storage technologies such as floating gate.
Process modifications may take years to optimize and may not be
economically feasible in the more advanced processes. To fill
the void in advanced process technologies, external serial Flash
solutions have increased in popularity; their low pin count and
small packaging can provide designers an alternative for code storage
needs. The tradeoffs are the increased bill of materials (BOM) for the
system, additional components such as power supplies to support
the serial Flash, and reduced performance due to the relatively
slow serial peripheral interface (SPI). For mobile products such
as portable media players (PMP) or smartphones, where the form
factor is shrinking and functionality is increasing, integrating
the external serial Flash into the SOC is beneficial for cost and performance. An
embedded NVM solution is therefore needed at 65nm and below. MRAM,
SONOS, RRAM and antifuse are possible contenders. Cell size,
density, scalability, read performance, program performance, array
efficiency, added cost, endurance, erase time, data retention and
high-volume mass production status are benchmarked in Table 1.
Table 1. Comparing MRAM, SONOS, RRAM and Antifuse Technologies
| | MRAM (Spin Torque) | SONOS | RRAM | Antifuse |
|---|---|---|---|---|
| Cell Size (F^2) | 10-30 | 10-15 | 6-20 | 10-70 |
| Density | Up to 256Mb | Up to 2Mb | Up to 64Mb | Up to 20Mb |
| Scalable | Limited, Write Current Increases with Scaling | Up to 40nm | Yes | Yes |
| Read Performance (MHz) | 75-125 | 20-50 | 20-100 | 15-100 |
| Program Performance | Fast | Medium | Medium | Medium |
| Array Efficiency | 40%-60% | 30%-50% | 40%-60% | 40%-60% |
| Added Cost | At Least +3 Mask Layers | At Least +8 Mask Layers | At Least +3 Mask Layers | No Additional Mask |
| Endurance | >10K | 10K | <1000 | <1000 (Emulated) |
| Erase Time | Fast | Medium | Medium | Medium |
| Data Retention | >10 Years | >10 Years | <1 Year | >10 Years |
| High-volume Mass Production | 5-10 Years Away | Now | 5-10 Years Away | Now |
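To make the cell size and array efficiency rows of Table 1 concrete, the following minimal Python sketch converts a cell size in F^2 into silicon area. It assumes that F is the process feature size and that array efficiency is the fraction of the memory macro occupied by bit cells; the function names, the 1Mb capacity and the specific values used in the example are illustrative choices from within the table's ranges, not figures from any product.

```python
def cell_area_um2(cell_size_f2: float, feature_nm: float) -> float:
    """Bit-cell area in square micrometres for a cell of `cell_size_f2` F^2."""
    feature_um = feature_nm / 1000.0
    return cell_size_f2 * feature_um ** 2


def array_area_mm2(bits: int, cell_size_f2: float, feature_nm: float,
                   array_efficiency: float) -> float:
    """Macro area in mm^2, assuming array efficiency = cell area / macro area."""
    cells_um2 = bits * cell_area_um2(cell_size_f2, feature_nm)
    return cells_um2 / array_efficiency / 1e6  # 1 mm^2 = 1e6 um^2


if __name__ == "__main__":
    # Illustrative inputs only: a 1Mb macro at 65nm with values taken
    # from inside the Table 1 ranges (assumptions, not measured data).
    bits = 1 << 20
    for name, f2, eff in [("SONOS", 12.5, 0.40), ("Antifuse", 40.0, 0.50)]:
        area = array_area_mm2(bits, f2, 65.0, eff)
        print(f"{name}: {area:.3f} mm^2")
```

Because area grows with the square of the feature size, the same F^2 figure costs far less silicon at each new node, which is why scalability matters as much as the raw cell size.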
MRAM
MRAM has been under development since the 1990s. It is one of
the earliest "universal memory" candidates. It is non-volatile and
was touted to have the capacity and cost benefit of DRAM and
the fast programming and random access of SRAM. The first-generation
MRAM technology had technical challenges, including
a program disturb mechanism that prevented manufacturability
for mass production. The MRAM bit cell consists of one magnetic
tunnel junction (MTJ) and a read transistor. The MTJ consists of
two ferromagnetic plates, each of which can hold a magnetic field,
separated by a thin insulating layer. One plate is magnetized to a fixed
polarity; the other can change through an external field to determine
the state of the bit cell. A strong electrical current is driven along the
plates to produce the magnetic field to program the MRAM. The
change in magnetic polarity is sensed as a resistance change in the
tunnel junction. The problem is that all of the neighboring storage
elements are exposed to the same fields, resulting in unwanted
programming. In addition to stability issues, the first-generation
MRAM was not scalable. The thermal stability of the MTJ degrades
as the spatial volume decreases.
Given the stability and scalability issues, most MRAM developers
abandoned the first-generation technology. The leading second-generation
technology is spin torque, also known as spin-transfer
torque MRAM (STT MRAM). STT MRAM is programmed with
an electrical current, not an applied magnetic field as was used in
the first-generation MRAM. The electrical current must be spin
polarized, which is achieved by passing the current through a thin
magnetic film or polarizer that is added onto one of the MTJ's
ferromagnetic plates. Data programming is performed by using the
spin-polarized current to change the magnetic orientation of the
storage element. Program disturb effects are minimized since writing
is isolated to the selected MTJ device. The difference in the resistance
of the MTJ device determines the state of the memory. The program
speed is fast in STT MRAM, but the endurance is not as good as in
the first-generation MRAM because there is a wear-out mechanism: the
thin tunneling layer is stressed each time programming current is
passed through the cell. Nevertheless, STT MRAM
is promising with benefits of small cell size and high performance.
The challenges facing deployment of STT MRAM include
proving the stability of the technology for mass production and
managing the cost adder in addition to potential multi-site wafer
handling. Embedded MRAM requires additional process steps,
including depositing the polarized material. This is done in another
facility for fear of contaminating the logic process. For
mass production, multi-site processing may be a logistical
nightmare. At least another seven to 10 years of development
and innovation lie ahead before STT MRAM reaches the mass market.
SONOS
NVM technology based on the SONOS structure was introduced
in the 1970s. It is a close relative of floating gate technology. Instead
of being stored in a polysilicon layer, charge is trapped in a
silicon nitride layer. There are N-channel based and P-channel based
SONOS devices. The bit cell is formed from a standard polysilicon
NMOS or PMOS transistor with a layer of silicon
nitride added inside the transistor's gate oxide; this is the ONO stack.
The nitride is non-conductive but can trap and hold electrostatic
charge. The nitride is electrically isolated from the surrounding
transistors. Depending on the SONOS cell structure, programming is
achieved through channel hot electron injection, channel hot
hole-induced hot electron injection, Fowler-Nordheim (FN) electron
tunneling or source side injection. Erase is achieved with FN
electron tunneling or band-to-band injection of holes.
SONOS technology has advanced through
refinement of the ONO stack. The ONO thickness has decreased
dramatically in the last three decades, from 500 angstroms to
around 100 angstroms. With the decreased thickness, the
programming voltage has decreased significantly from 30V to about
10V. With decreased bit cell size and programming voltage, higher
capacity is possible. Discrete SONOS solutions will be more prevalent
in addressing the shortfalls of floating gate scalability for NOR and
NAND Flash. SONOS solved the charge loss issue seen in floating
gate technologies but has the opposite problem: over time, some
electrons become so strongly trapped in the ONO stack that they
cannot be removed, creating a permanently programmed bit.
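The thickness and voltage figures quoted for the ONO stack can be turned into a rough average electric field. The sketch below assumes, as a simplification, that the entire programming voltage drops uniformly across the ONO stack; a real stack divides the voltage unevenly among its layers, so the results are only indicative. The function name is illustrative.

```python
ANGSTROM_TO_CM = 1e-8  # 1 angstrom = 1e-8 cm


def average_field_mv_per_cm(voltage_v: float, thickness_angstrom: float) -> float:
    """Average field in MV/cm, assuming a uniform voltage drop across the stack."""
    thickness_cm = thickness_angstrom * ANGSTROM_TO_CM
    return voltage_v / thickness_cm / 1e6


if __name__ == "__main__":
    # Figures quoted in the text: 30V across 500 angstroms, 10V across 100 angstroms.
    for volts, angstroms in [(30.0, 500.0), (10.0, 100.0)]:
        field = average_field_mv_per_cm(volts, angstroms)
        print(f"{volts:.0f}V across {angstroms:.0f}A: about {field:.0f} MV/cm")
```

Under this simplification the voltage fell by a factor of three while the thickness fell by a factor of five, so the average field across the stack rose rather than fell.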
The relatively high programming voltage requires the addition of
high-voltage devices to the standard CMOS process, increasing the
area and, more importantly, adding extra mask layers and processing
steps that make it a potentially expensive solution for embedded
NVM.
RRAM
RRAM has thrown its hat in the ring as a "universal memory"
candidate. Of all the candidates, RRAM is the newest. Research
started in 2000 when it was discovered that an electrical pulse can
induce resistive change in thin film materials, including chalcogenide
glasses, perovskites and silicon dioxide. There are numerous theories
proposed for the memory mechanism in RRAM. Examples include
the alignment of oxygen vacancies in oxides, oxidation and reduction
in perovskites, and the formation of metal filaments in chalcogenide
glasses. RRAMs come in both unipolar and bipolar varieties. The
unipolar device can be both programmed and erased with the same
voltage polarity, while bipolar devices require opposite polarities to
program and erase.
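The difference between unipolar and bipolar operation can be sketched as a toy state model. This is not a device model: the class name, the convention that program pulses are positive, and the choice of low resistance as the programmed state are all assumptions made for illustration. Real unipolar devices distinguish program from erase by pulse amplitude or duration, which this sketch omits.

```python
from dataclasses import dataclass

HIGH_RESISTANCE = "HRS"  # treated here as the erased state (assumption)
LOW_RESISTANCE = "LRS"   # treated here as the programmed state (assumption)


@dataclass
class ToyRRAMCell:
    """Toy cell: tracks only the resistance state and pulse polarity rules."""
    bipolar: bool
    state: str = HIGH_RESISTANCE

    def program(self, polarity: int) -> bool:
        """Apply a program pulse; by convention here it must be positive (+1)."""
        if polarity != +1:
            return False
        self.state = LOW_RESISTANCE
        return True

    def erase(self, polarity: int) -> bool:
        """Apply an erase pulse; bipolar cells need the opposite polarity (-1)."""
        required = -1 if self.bipolar else +1
        if polarity != required:
            return False
        self.state = HIGH_RESISTANCE
        return True


if __name__ == "__main__":
    unipolar = ToyRRAMCell(bipolar=False)
    bipolar = ToyRRAMCell(bipolar=True)
    for cell in (unipolar, bipolar):
        cell.program(+1)
        # A same-polarity erase succeeds only on the unipolar cell.
        print(cell.bipolar, cell.erase(+1), cell.state)
```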
The advantages of RRAM are its fast access time (<10ns), low
programming current (~10µA), and the large number of materials that
show hysteresis (memory) behavior. With these advantages, many
companies from around the world have jumped onto the RRAM
bandwagon. But initial findings on many of the RRAM types have
shown poor endurance (<1,000 cycles), poor retention (<1 year),
and a limited understanding of the mechanisms involved. Even the
limited reports of better results have been on single cells or very
small arrays, so the biggest challenge will be to prove RRAM in mass
production. There may be another five to 10 years of development
ahead before RRAMs could go into widespread commercial usage.
Antifuse
Antifuse has been around as long as SONOS. An antifuse is the opposite
of an electrical fuse. A fuse is initially shorted (low resistance),
and the application of electrical stress causes it to open. An antifuse
is initially open (high resistance), and the application of electrical
stress causes it to short. Antifuse NVM has been implemented for decades, and
since 2001, it can be implemented in a standard CMOS process
without additional processing steps. This technology is being used in
products in mass production.
Figure 1. Antifuse Bit Cell with a Program Transistor and a Select Transistor

A hard gate oxide breakdown is used as the one-time
programmable (OTP) NVM mechanism. The breakdown is achieved by applying a high voltage on the word line program (WLP) gate as
shown in Figure 1. Before the breakdown, the gate and the
source of the program transistor are isolated from each other, like the plates of a capacitor. After
the breakdown, the oxide behaves like a resistor between the gate and the
source. The program transistor is isolated from the word line read
(WLR) select transistor. Both the program and select transistors are
implemented using core devices, so as the technology scales, the bit
cell scales.
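The one-way nature of oxide breakdown can be sketched in a few lines of Python. The mapping of an intact oxide to logic 0 and a broken-down oxide to logic 1 is an assumption for illustration, as are the class and method names; the point of the sketch is only that a programmed bit can never return to its original state.

```python
class AntifuseBit:
    """Toy OTP bit: capacitor-like until programmed, resistor-like afterwards."""

    def __init__(self) -> None:
        self._broken_down = False

    def program(self) -> None:
        # Hard oxide breakdown is irreversible; there is no erase method.
        self._broken_down = True

    def read(self) -> int:
        # Assumed mapping: intact oxide reads 0, broken-down oxide reads 1.
        return 1 if self._broken_down else 0


class AntifuseWord:
    """A word of OTP bits; writing can only turn 0s into 1s."""

    def __init__(self, width: int = 8) -> None:
        self.bits = [AntifuseBit() for _ in range(width)]

    def write(self, value: int) -> bool:
        """Program the 1 bits of `value`; return True if the word now equals it."""
        for i, bit in enumerate(self.bits):
            if (value >> i) & 1:
                bit.program()
        return self.read() == value

    def read(self) -> int:
        return sum(bit.read() << i for i, bit in enumerate(self.bits))


if __name__ == "__main__":
    word = AntifuseWord(width=8)
    print(word.write(0b1010), bin(word.read()))
    # A second write cannot clear bits that are already programmed.
    print(word.write(0b0101), bin(word.read()))
```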
Of all the technologies discussed in this article, antifuse is the
only OTP NVM. It appears to be an unlikely contender for future
embedded Flash solutions since it has an endurance of one. However,
for many applications requiring code storage, fewer than 1,000 cycles
of endurance is all that is required. Antifuse is a possible solution
because it has two significant characteristics going for it that will
enable emulated endurance of less than 1,000 cycles without the area penalty:
technology and cost scaling (Figure 2). Antifuse can be embedded
without limitation in bulk silicon, silicon-on-insulator
(SOI) or silicon germanium (SiGe) processes. Oxide breakdown
can occur in silicon dioxide from the 180nm node and below, or
in high-K metal gate (HKMG) from the 32/28nm node and below.
Cost, in terms of area and programming speed, is decreasing with
each subsequent process generation. The memory array consists of
core devices, so it scales with the process technology. The thickness of
the oxide decreases as the process scales. As a result, the programming
speed increases.
Figure 2. Scaling Trend for Antifuse Memory Array and
Programming Voltage
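The article does not describe how endurance is emulated on an OTP array. One plausible approach, shown here purely as an assumption, is to reserve a fresh set of OTP bits for every permitted rewrite and to read back the most recently written set. The class, function and parameter names below are hypothetical.

```python
from typing import List, Optional


class EmulatedRewritableWord:
    """Emulates a rewritable word by consuming one unused OTP slot per write."""

    def __init__(self, max_writes: int) -> None:
        # None marks a slot whose OTP bits have not been programmed yet.
        self._slots: List[Optional[int]] = [None] * max_writes
        self._used = 0

    @property
    def remaining_writes(self) -> int:
        return len(self._slots) - self._used

    def write(self, value: int) -> None:
        if self._used == len(self._slots):
            raise RuntimeError("emulated endurance exhausted")
        self._slots[self._used] = value  # program the next blank slot once
        self._used += 1

    def read(self) -> Optional[int]:
        """Return the latest written value, or None if never written."""
        return self._slots[self._used - 1] if self._used else None


def physical_bits(logical_bits: int, max_writes: int) -> int:
    """Physical OTP bits needed, ignoring any bookkeeping overhead."""
    return logical_bits * max_writes


if __name__ == "__main__":
    word = EmulatedRewritableWord(max_writes=3)
    word.write(0xA5)
    word.write(0x5A)
    print(hex(word.read()), word.remaining_writes)
```

Under this assumed scheme the physical array grows in proportion to the number of rewrites supported, which is why the area and cost scaling of the bit cell determines how much emulated endurance is practical.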

Given that the technology is proven and the scaling trend enables
higher capacity memories, antifuse can be deployed now. The
challenge facing antifuse technologies is to achieve greater adoption
for code storage usage compared to traditional usages such as security
keys and configuration. Because software developers are accustomed to
the high endurance of embedded Flash and discrete serial Flash
technologies, they may need to alter their coding style to utilize antifuse efficiently.
Conclusion
At 65nm and below, the future of embedded Flash-like technology is
bright. There are several compelling technologies in development or
deployment. Some, including STT MRAM and RRAM, are many
years away; but if they can come to market with a stable and cost-effective solution, they could create a paradigm shift in the way volatile
memory and NVM are used. For now, however, SONOS and antifuse
are the only viable solutions for embedded Flash applications. The
overall cost of SONOS is higher than that of antifuse due to process adders
and a potentially larger area due to the higher programming voltage
required. But SONOS has the traditional endurance of
today's Flash technologies. Antifuse is an OTP technology but has higher array
efficiency and a more compact area than SONOS. It could be used
for applications requiring <1,000 cycles of endurance.
About the Author
Lee Cleveland has 26 years of experience in the semiconductor industry
leading cutting-edge NVM development and high-valued analog IC
design. Most recently, Lee was the senior vice president of engineering
at Sipex, now part of Exar. Prior to Sipex/Exar, Lee held engineering
management positions at AMD and Spansion where he was directly
responsible for several generations of platform products, earning 61 patents
in algorithms, circuits, processes and testing techniques. He received his
B.S. in electrical engineering at University of California, Berkeley. You
can reach Lee Cleveland at leec@kilopass.com.