For the past few years, the most popular topics in the 2.5D space have been:
- The design tools
- Foundry processes for through-silicon vias, temporary bonding and bump architecture
- The assembly process, such as what is first bonded to what
The industry is at the point where the open variables on these topics are narrowing, and other critical aspects need to get far better attention. The new product introduction (NPI) process for heterogeneous 2.5D devices opens up a lot of possibilities but also creates many complications that need to be considered early in the process.
The focus now needs to shift towards:
- Fundamental architecture differences
- Changes in our physical design process
- Design for test (DFT) strategy changes
- Testing at both the wafer sort and final test stages
- Challenges in characterization
- … and the 800-pound gorilla in the room, yield management
In order to extract all the nectar from a 2.5D implementation, the process has to begin at the architecture phase. Taking an existing die meant for a conventional 2D package and adapting it is a losing proposition.
Tapeout costs for emerging foundry nodes are increasing exponentially. Taping out several chips in legacy nodes may therefore be less expensive than a single tapeout in the advanced node required for a monolithic chip.
The key architecture differences are:
- The use of low power and highly parallel die-die interfaces
- Ability to use very fine pitch connections
- Additional interfaces that would have been used for blocks within the monolithic die but are now going to an external die
- Ability to mix wafer processes
- Voltage regulators can be implemented in silicon using a power process and inserted into the package, making power management much easier
- Massive amounts of high-bandwidth memory in a DRAM process that can be partitioned differently for more efficient data management
- Analog functions that can be implemented in the legacy nodes in which they are best characterized
- Logic may not need the more advanced foundry nodes
- Plan for reuse
- Given that taping out several legacy-node devices meant for 2.5D implementation may be less expensive than a tapeout in the latest node, plan to magnify these savings by architecting the chips to be used on other projects as well
- Reuse is not just for intellectual property (IP) anymore. The tiles (die used in 2.5D architectures) can be reused and this needs to be part of the plan. Think of blocks used for IP as potential tiles
- Power savings
- Without the need to drive long distances, power may be reduced by lowering the voltage and eliminating the need for advanced signaling such as equalization or pre-emphasis
- Leverage the ability to have far greater capacitance adjacent to the tiles to reduce the aggressive clocking and resultant power consumption
- Achieve cost savings at the system level. While 2.5D packages are more expensive, they may actually reduce costs at the system level. For example, having all of the memory accessible internal to the package may eliminate large 256- or 512-bit memory interfaces that drive up the printed circuit board (PCB) layer count. Today, networking boards can have upwards of 36 layers with advanced via structures driven mostly by memory interfaces. Using 2.5D packages can significantly reduce the board layer count using standard PCB vias.
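The power-savings bullets above rest on the classic CMOS dynamic-power relation, P = C·V²·f: because power scales with the square of the supply voltage, a short-reach die-to-die link that can run at a lower voltage saves power quadratically. A minimal sketch, where the capacitance, voltage, and frequency values are purely illustrative assumptions:

```python
# Rough dynamic-power scaling sketch: P_dyn ~ C_eff * V^2 * f.
# All numeric values below are illustrative assumptions, not measured figures.

def dynamic_power(c_eff_farads: float, v_volts: float, f_hz: float) -> float:
    """Classic CMOS dynamic-power estimate: P = C_eff * V^2 * f."""
    return c_eff_farads * v_volts ** 2 * f_hz

# A long-reach signal driven at 1.0 V vs. a millimeter-scale die-to-die
# link that can tolerate a lowered 0.8 V swing (same switched capacitance
# and clock, for comparison purposes only).
p_long_reach = dynamic_power(1e-12, 1.0, 2e9)
p_die_to_die = dynamic_power(1e-12, 0.8, 2e9)

savings = 1 - p_die_to_die / p_long_reach
print(f"dynamic power reduced by {savings:.0%}")  # 36% from voltage scaling alone
```

In practice the die-to-die case saves even more, since the switched capacitance of a short interposer trace is also far lower than that of a package-and-board route.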
With considerably different architectures comes a different way to view physical design. Hierarchical design methodology can now be applied at the tile level; within this methodology, there is not much difference in the process or the tools used.
The interfaces on the custom die must face the tiles they are interfacing with. The memory interfaces may be quite large, so this may end up making the die more naturally rectangular instead of the more common square shape of current monolithic ASICs.
The new interfaces will initially be the largest risk until they are proven. There will be many new interfaces, but once the library of IP is developed, better reuse is expected. These interfaces would breathe new life into legacy nodes.
Mitigation strategies are needed for these new tiles and IPs. They would likely need to leverage multi-project wafers (MPWs) with prototype interposers, so the development costs may be higher. The interface count may also be extremely high, possibly higher than the number of available tester channels. Therefore, more advanced strategies may be needed, such as testing pairs of tiles together (high-pin-count interfaces facing each other) with a lower number of test pins coming out of the package to perform test and characterization.
Design for Test
As if DFT were not complicated enough, DFT in a heterogeneous 2.5D system will be even more complex. The key here is that in a given package, one of the die will be custom or programmable. The test will therefore need to be compressed into the custom die to test the overall system.
In order to keep the test time low, a large number of pins will need to be brought out to package balls, which may exceed the number of pins otherwise needed. For example, if the 512-bit memory interface is eliminated, leaving just SerDes and control pins, that may not be enough for test purposes. A significant number of pins dedicated to testing the packaged device may need to be added, on the order of 600 pins or so depending upon the complexity of the device. In existing monolithic solutions, the industry was able to mux test functions into large interfaces. If few large interfaces exist at the package level because the bulk of the interconnect remains within the package, then a new strategy will be needed.
In order to facilitate the overall test strategy of the package, the individual tiles may need to be configured to accept the test methodology from the custom ASIC. This requires planning not only by the architect, but also by the tile suppliers.
The body size of conventional packages with large memory interfaces was generally dictated by the area needed to fit all of the signal pins. In a 2.5D structure, the area of the silicon tiles will likely dictate this, unless the number of test pins becomes a first-order factor. Generally, the more test pins available, the shorter the test time. As this may impact the package size, the determination of the number of test pins needs to account for the package impact, given that fewer pins are available for muxing test signals.
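The test-pin versus test-time tradeoff above is, to first order, simple arithmetic: scan shift time falls roughly in proportion to the number of scan pins available. A back-of-envelope sketch, where the scan data volume and shift clock are illustrative assumptions rather than figures from any particular device:

```python
# Back-of-envelope scan test time vs. available test pins.
# TOTAL_BITS and SHIFT_HZ are illustrative assumptions for a large 2.5D device.

def scan_test_seconds(total_scan_bits: float, scan_pins: int, shift_hz: float) -> float:
    """Shift time = bits per pin / shift clock; ignores capture cycles and setup."""
    bits_per_pin = total_scan_bits / scan_pins
    return bits_per_pin / shift_hz

TOTAL_BITS = 4e9   # assumed total scan data volume for the whole package
SHIFT_HZ = 100e6   # assumed 100 MHz scan shift clock

for pins in (64, 256, 600):
    t = scan_test_seconds(TOTAL_BITS, pins, SHIFT_HZ)
    print(f"{pins:4d} test pins -> {t:6.3f} s of shift time")
```

Under these assumptions, going from 64 to 600 test pins cuts shift time by nearly 10X, which is why the pin budget is worth trading against package size.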
Test Issues and Strategies — Wafer Sort
Known good die (KGD), which few companies offer, have always been a problem for the industry. With 2.5D, the problem expands to known good tiles (KGT) and known good interposers (KGI). The fine pitch of the tiles and interposers, in the 30-80um bump pitch range, poses an added difficulty in achieving known good status.
The ultrawide interfaces on these die present far more pins than can practically be probed, so the likely approach is loopback at the probe card level combined with probing only a subset of pins.
Not probing every bump may sound like a very bad thing to do, but given the proper I/O design, much of the signal may be testable without contacting the bump itself, and this ties back to the tile reuse and architecture discussion mentioned earlier.
Understanding test coverage may be more important than trying to achieve known good status. Partial coverage will impact yield, and quantifying that yield impact helps establish the overhead cost of the expected lower yield.
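One common way to quantify the coverage-versus-escapes tradeoff is the well-known Williams-Brown model, which relates process yield Y and fault coverage T to the shipped defect level as DL = 1 − Y^(1−T). A minimal sketch; the yield and coverage values are illustrative assumptions:

```python
# Hedged sketch using the Williams-Brown model: DL = 1 - Y**(1 - T).
# Relates process yield (Y) and fault coverage (T) to test escapes.
# The 85% yield and the coverage points below are illustrative assumptions.

def defect_level(process_yield: float, fault_coverage: float) -> float:
    """Williams-Brown estimate of the fraction of escapes shipped as good parts."""
    return 1 - process_yield ** (1 - fault_coverage)

for coverage in (0.90, 0.99, 0.999):
    dl = defect_level(0.85, coverage)
    print(f"coverage {coverage:.1%} -> ~{dl * 1e6:,.0f} DPPM shipped escapes")
```

The point of the model for this discussion: when probing only a subset of bumps lowers coverage, the resulting escape rate can be estimated and priced in, rather than chasing unattainable known good status.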
Other strategies for optimizing wafer sort include using some spare areas on the die for more conventional probe technology to contact a subset of pins. This reduces the cost of the probe cards and extends the hardware life.
Having different geometries for probing, such as the probe pads used in the JEDEC HBM (JESD235) specification or a larger bump that can serve both test and functional use, would also be effective mechanisms to reduce probe hardware costs.
Test Issues and Strategies — Final Test
As mentioned earlier, there is a tradeoff between the number of test pins brought out versus package size and test time. Given that fewer pins come out of these 2.5D package systems, fewer pins are available for muxing test functionality.
The key is to drive the test through the custom or programmable die to test the other tiles in the package and return a pass or a failure with a signature of what failed. These need to be distinguishable, which is more difficult when testing several die through a subset of those die.
With this much integration, final test may take more time, and heat dissipation must be managed throughout the test. This may impact the test hardware and may change the junction temperature over the course of the test process. These temperature changes during final test need to be well understood during characterization so that tests later in the test program do not yield differently than expected.
Characterization
Characterizing die that are meant solely for use in die-to-die connectivity presents some unique challenges.
- The pin count in the highly parallel interfaces may be far greater than the pins available on the tester
- The drive strength of the interfaces (meant to reduce overall power) may not be strong enough to drive through a test package and loadboard since it was only meant to drive a millimeter or so to the neighboring die
Using loopback may be possible and may be attractive given the industry experience here, but it may not be possible in all cases. The other approach would be to have two die communicate with each other. They would have to be selected from neighboring positions on a wafer to have a better likelihood of having the closest IDSAT (drain current in saturation) to each other, as IDSAT generally varies across a wafer. Characterizing the die in pairs may improve the ability to exercise the interfaces, but it obfuscates the result: it only takes one die to fail to cause the pair to fail. If the die are of the same design from neighboring locations on a wafer, it may not matter which die is at fault, just that a fault exists.
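The pairing step above amounts to selecting die whose wafer positions are adjacent so their process parameters (such as IDSAT) track each other. A minimal sketch of one way to do that; the die records, field names, and wafer coordinates are hypothetical:

```python
# Hedged sketch: greedily pair die from neighboring wafer sites so that
# process parameters such as IDSAT match as closely as possible.
# The die records, field names, and coordinates below are hypothetical.

from itertools import combinations

def pair_neighbors(die):
    """Greedily pair die by minimal squared distance between wafer (x, y) sites."""
    remaining = list(die)
    pairs = []
    while len(remaining) > 1:
        a, b = min(
            combinations(remaining, 2),
            key=lambda p: (p[0]["x"] - p[1]["x"]) ** 2 + (p[0]["y"] - p[1]["y"]) ** 2,
        )
        pairs.append((a["id"], b["id"]))
        remaining.remove(a)
        remaining.remove(b)
    return pairs

wafer_map = [
    {"id": "D1", "x": 0, "y": 0},
    {"id": "D2", "x": 0, "y": 1},   # adjacent to D1
    {"id": "D3", "x": 5, "y": 5},
    {"id": "D4", "x": 5, "y": 6},   # adjacent to D3
]
print(pair_neighbors(wafer_map))  # [('D1', 'D2'), ('D3', 'D4')]
```

A production flow would pair on measured process-monitor readings rather than position alone, but position is a reasonable proxy when per-die measurements are not yet available.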
In order to maximize yield for these tiles given the test-related difficulties, it is important to trace the process skews at the tile level with a ring oscillator. Having a temperature sensor on the die will also help in monitoring the effect of temperature in the test more accurately, as there will be longer test times. The temperature may be considerably higher for tests later in the test program than those at the beginning, even with temperature control on the handler.
Tiles procured externally may not have the same level of characterization as an ASIC device. When the externally procured devices have high-density interfaces, it is very likely that not all of those pins were tested, which may compound the testability issues in the packaged system. In this case, clearly understanding not only the die specification but also how the partner characterized the tile is important. Understanding this will help in designing a system test that covers the holes in the tile-level characterization.
Yield Management
The greatest challenge, without question, in 2.5D integration is yield management. With more die integrated, the yield will drop. Conversely, if the custom ASIC is smaller than the equivalent monolithic ASIC, its yield may be better. The best way to approach this is to establish a known yield in order to set expectations and margin accordingly. This requires a deep understanding of:
- The yield factors for the custom ASIC
- Yield of the tiles in the package
- Interposer and substrate yield
- Yield of the 2.5D assembly process
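To first order, these factors compound multiplicatively: assuming independent defect mechanisms, the package yield is the product of the component yields, with the tile yield raised to the number of tiles. A minimal sketch; every yield number below is an illustrative assumption, not industry data:

```python
# Hedged sketch: first-order compound yield of a 2.5D assembly, assuming
# independent defect mechanisms. All yield numbers are illustrative only.

from math import prod

def package_yield(asic: float, tile: float, n_tiles: int,
                  interposer: float, assembly: float) -> float:
    """Compound yield = ASIC * tile^n * interposer * assembly."""
    return prod([asic, tile ** n_tiles, interposer, assembly])

# Example: custom ASIC at 95%, four HBM-class tiles at 99% each,
# interposer at 98%, assembly process at 97%.
y = package_yield(0.95, 0.99, 4, 0.98, 0.97)
print(f"expected package yield: {y:.1%}")  # 86.7%
```

The sketch makes the article's point concrete: even with every component yielding in the high 90s, the compound package yield drops into the mid 80s, which is the margin that someone in the supply chain must own.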
The assembly yield is the only thing the OSATs will take responsibility for. The testability of the interposer depends upon the technology used. The mainstream industry approach, silicon interposers, has the lowest testability, as there is no current standard for testing the full interconnect. Most players are making broad assumptions that the via yields will be high and the electrical quality can be established visually, which is quite a leap of faith. Other interposer technologies have better testability and require a lot less faith.
Isolating the individual die for characterization can be done even in these systems-in-packages as long as the die skews can be recorded with on-chip process monitors that can be read after the assembly is completed.
Unfortunately, ownership of yield is not as well defined. The business model will define who owns yield. If this is a turnkey process from a foundry or an integrator (such as eSilicon), the foundry or integrator has general yield responsibility. If a user decides to employ a disaggregated approach and interact with each member of the supply chain separately, then ownership of yield will most certainly fall on the entity that is sourcing all of these services.
What does become easier is board-level yield management. The PCBs become simpler and the bulk of the hardware is tested prior to board level assembly.
In summary, the industry has come a long way in recent years to develop the fab and assembly processes needed for 2.5D integration. The need to develop the other key technologies and focus on maximizing the benefit of this technology is paramount. It is not the technology itself that is the breakthrough, but instead how it is used. This has to start at the architecture with all of the downstream planning outlined in this document.
The potential for not only higher processing capability through a >10X improvement in memory access, but also simpler signal transmission through fiber optics connected to the 2.5D device and lower overall system cost, shows that it is not a matter of whether this technology will take off, but when. eSilicon has been developing comprehensive 2.5D capability for some time and has been working with current and future tile suppliers to enable the ecosystem, as well as developing the overall NPI process to ease the transition for our customers. This work is by no means complete, and this technology requires the whole village to raise it.