When I first arrived at FluxCorp, we were designing integrated circuits at the 130 nm node. The ‘130 nm’ refers to the smallest feature on the chip, typically the gate terminal of a MOS transistor, which is just above the 100 nm cusp of what is typically considered nanotechnology. Since then, we’ve moved on to designs in 90 nm, 65 nm, 45 nm, and now 28 nm as the struggle to keep pace with Moore’s prediction soldiers on. The challenges in scaling transistors aren’t just in fabrication and the lithography methods used to draw such tiny wires. Now that we are firmly planted in the nano-world, circuit effects that used to be considered negligible are starting to rear their ugly heads. One of those effects is electromigration, or EM.
The concept behind EM is quite simple. Too much electrical current through a wire causes high-speed electrons to bump into atoms, which either get knocked “downstream” or sputter off the wire altogether. As more atoms get moved, the wire becomes thinner, increasing its resistance and creating potential problems in circuit performance. Once enough atoms have been moved to create a break in the wire, the circuit malfunctions. The most common everyday example of EM failure is a light bulb burning out. On microchips, EM is one of two major time-dependent failure mechanisms; the other is over-voltage. I’ve discussed both of these topics previously in my own blog at the Flying Flux (here, here, here, and here).
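For a feel of how strongly wire lifetime depends on current, the relationship is commonly modeled with Black's equation, MTTF = A · J^(-n) · exp(Ea / kT), where J is current density and T is temperature. The sketch below computes lifetime relative to a reference condition so the fitting constant A cancels out. The activation energy (0.9 eV) and exponent (n = 2) are illustrative assumptions on my part, not numbers from any particular process, and the function name is made up for this example.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant, eV per kelvin


def relative_lifetime(j_ratio, temp_k, ref_temp_k,
                      activation_energy_ev=0.9, exponent_n=2.0):
    """Wire lifetime relative to a reference condition, per Black's equation.

    MTTF = A * J**(-n) * exp(Ea / (k * T)); the constant A cancels in the ratio.
    j_ratio is (current density) / (reference current density).
    Default Ea and n are assumed, illustrative values.
    """
    current_term = j_ratio ** (-exponent_n)
    thermal_term = math.exp(
        (activation_energy_ev / BOLTZMANN_EV_PER_K)
        * (1.0 / temp_k - 1.0 / ref_temp_k)
    )
    return current_term * thermal_term


# Doubling the current density at an unchanged temperature of 105 C.
print(relative_lifetime(2.0, 378.15, 378.15))
```

With n = 2, doubling the current density cuts the expected lifetime to a quarter, and running hotter shortens it further through the exponential term. That is why a wire that is only slightly undersized can look fine in the lab and still wear out early in the field.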
One of the more high-profile EM failures is a January recall by Intel of its latest 6-series chipset. According to a PC World article, Intel said “it had stopped shipments of the chipset used with its latest generation of Core processors after it found a design flaw.”
Intel discovered a design issue in the 6-Series chipset, which is code-named Cougar Point and is used in systems with Sandy Bridge processors, which started shipping on Jan. 9. Intel said the Serial-ATA (SATA) ports within the chipsets could degrade over time, which could impact performance or functionality of storage devices such as hard drives.
A symptom of a faulty chipset could be bit-rate errors during data transfers, said Steve Smith, Intel vice president and director of PC client operations and enabling, during a conference call.
It’s unlikely that PCs would experience failures immediately, but aggressive data transfers over time could cause more errors, Smith said.
The failure signature where “aggressive data transfers over time could cause more errors” is consistent with an electromigration problem. But how can Intel, with its vast resources, let a problem like this slip through?
Designing for robust EM performance is quite straightforward: use a wide enough wire and EM is no longer an issue. However, the desire for good EM performance may conflict with the desire for good circuit performance, as wider wires lead to higher parasitic capacitances. So simply oversizing wires everywhere isn’t a solution. And despite the financial implications of designing a chip with poor EM reliability, as the Intel case shows, robust EM analysis of integrated circuits to catch weak spots is something relatively new. Back when transistor gates were above 100 nm, EM wasn’t much of a concern because wires could only get so thin. Once designs fell further and further below 100 nm, thinner wires were no longer just a potential EM failure concern; they became more and more an EM failure likelihood.
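The width-versus-capacitance tradeoff can be put into rough numbers. The sketch below sizes a wire from a maximum allowed current density and then shows how the parasitic area capacitance grows as the wire is widened for extra margin. Every value here (the 10 mA/µm² limit, the 0.2 µm metal thickness, the 0.04 fF/µm² capacitance) is a hypothetical placeholder rather than a real process parameter, and fringe capacitance is ignored to keep it short.

```python
def min_width_um(current_ma, thickness_um, j_max_ma_per_um2):
    """Narrowest wire width that keeps current density at or below the limit."""
    return current_ma / (thickness_um * j_max_ma_per_um2)


def wire_capacitance_ff(width_um, length_um, cap_ff_per_um2):
    """Parallel-plate (area) capacitance of the wire; fringe terms ignored."""
    return width_um * length_um * cap_ff_per_um2


# Hypothetical numbers, for illustration only.
current_ma = 1.0          # average current carried by the wire
thickness_um = 0.2        # metal layer thickness
j_max_ma_per_um2 = 10.0   # assumed EM limit (about 1 MA/cm^2)
length_um = 100.0         # wire length
cap_ff_per_um2 = 0.04     # assumed area capacitance to the layer below

base_width = min_width_um(current_ma, thickness_um, j_max_ma_per_um2)

# Widening the wire buys EM margin but costs capacitance in proportion.
for margin in (1.0, 2.0, 4.0):
    width = base_width * margin
    cap = wire_capacitance_ff(width, length_um, cap_ff_per_um2)
    print(f"margin x{margin:.0f}: width {width:.2f} um, capacitance {cap:.1f} fF")
```

In this simple model the capacitance scales linearly with width, so a 4x EM safety margin means 4x the area capacitance on that net. On a high-speed signal path, that extra loading is exactly the kind of performance cost that keeps designers from oversizing every wire.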
Digital integrated circuit design, with its much heavier utilization of automation tools, led the way in EM analysis. Even so, commercial tools were only starting to take shape five years ago. Not much was offered for the analog designers, however. We were simply told to be careful when sizing our wires. Meanwhile, FluxCorp started development of an in-house EM software tool for our analog designers. The initial tools were buggy and quite inefficient. For 90 nm transistors, we were told the tool was available to be used if we were interested. For 65 nm designs, with improvements in computational methods, EM checking became strongly recommended. With additional bug fixes, EM checking became a hard requirement for 45 nm designs.
The Intel 6-series chipset is using 65 nm technology. It’s also likely that Intel outsourced the problem circuit in question (the SATA port used to talk to hard drives). If FluxCorp is anything to go by, the third-party vendor probably did not have a robust EM checking methodology in place when they delivered the 65 nm design to Intel 2-3 years ago. Fast forward to today; the vendor now has a much better EM checking methodology in place. This allows them to go back and check for potential issues in past designs. Lo and behold, they found some in their 65 nm designs. Unfortunately, Intel is now in full production and has started to ship parts to customers. A major embarrassment and a deep hit to the pocketbook for both Intel and the vendor.
As microelectronics scale to smaller and smaller geometries, more of these difficulties will keep analog IC designers both frustrated and busy, but at the very least, still employed.