New Power Concerns At 10/7nm
October 12, 2017
As chip sizes and complexity continue to grow exponentially at 7nm and below, managing power is becoming much more difficult.
There are a number of factors that come into play at advanced nodes, including more and different types of processors, more chip-package decisions, and more susceptibility to noise of all sorts due to thinner insulation layers and wires. The result is that engineers now need to consider a slew of thermal, packaging and electromagnetic issues that were never serious concerns at previous nodes.
At a high level, these factors dramatically increase the importance of analyzing power from a system level, across multiple different operating scenarios and process corners.
“What is needed is the ability to analyze this large power data and use it to enable timely decisions that can impact the design process,” said Arti Dwivedi, principal technical product manager at ANSYS. “This stresses the need for data analytics-based power solutions with elastic compute and big data architectures.”
This requires more than just tooling, though. “At advanced nodes, low-power, high-performance design is a big challenge not just for design styles but also for design flows,” said Jerry Zhao, product management director in the Digital & Signoff Group at Cadence. “At the architectural level, you need to be more power-efficient. But how to implement that pushes the challenges to the tools as to how the design will be analyzed and optimized. When we talk about process technology, moving to finFETs is a revolutionary change, given the 3D structures and all of the things that must be changed. Since 16nm, 10nm, 7nm, 5nm and now even 3nm, it’s more like incremental changes and enhancements. The density [of circuits in a design] is very high. Therefore the temperature within the design will be high, driven by activities such as watching videos.”
Below 16/14nm, some things remain the same, some are different. For example, there is little change in the management of clocks and power domains. And many design teams have acquired knowledge at 16/14nm about how to deal with self-heating, which is caused by heat trapped along the fins of the transistors.
At 10/7nm, though, some rules change related to power integrity, Zhao said. There are new concerns, such as how to manufacture vias and incorporate new technologies like pillars and bridges to make more efficient connections. These allow the resistance of the power grid to be controlled so that the overall IR drop and EM limitation will be improved. All of this is being worked into analysis tools.
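To make the IR-drop point concrete, here is a minimal sketch rather than a signoff method. It models a single power rail fed from one end as a chain of resistive segments, with a current tap at each node. The function name `rail_ir_drop` and all resistance and current values are illustrative assumptions, not figures from Cadence or any foundry.

```python
def rail_ir_drop(segment_res_ohm, tap_currents_a):
    """Cumulative voltage drop at each tap of a rail fed from one end.

    segment_res_ohm[i] is the resistance between tap i-1 (or the supply
    pad, for i == 0) and tap i; tap_currents_a[i] is the current at tap i.
    """
    if len(segment_res_ohm) != len(tap_currents_a):
        raise ValueError("need one segment resistance per tap")
    drops = []
    total_drop = 0.0
    for i, r in enumerate(segment_res_ohm):
        # Segment i carries the current of every tap at or beyond it.
        downstream_current = sum(tap_currents_a[i:])
        total_drop += r * downstream_current
        drops.append(total_drop)
    return drops


# Hypothetical rail: four taps drawing 10 mA each, 0.5 ohm per segment.
baseline = rail_ir_drop([0.5] * 4, [0.01] * 4)
# Same load with segment resistance halved (assumed effect of a
# lower-resistance connection scheme).
improved = rail_ir_drop([0.25] * 4, [0.01] * 4)
print(baseline[-1], improved[-1])  # worst-case drop at the far end, in volts
```

In this toy model the worst-case drop scales linearly with segment resistance, which is why controlling grid resistance improves IR drop directly.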
Dynamic power effects
With higher pin capacitance and current densities in finFETs, the management of dynamic power, power noise and thermal issues becomes more difficult, Dwivedi observed. “Designers run emulation for real application scenarios, which can often range across billions of cycles. Cycle-based power profiling of these real application scenarios can provide actionable insights into events such as di/dt or sustained high average power, which can cause power integrity issues and thermal hotspots. RTL activity based current profiles can also be used to enable chip-package-system co-design.”
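The kind of cycle-based profiling Dwivedi describes can be sketched roughly as follows. This is an assumption-laden illustration, not the ANSYS flow. It scans a per-cycle power trace for large cycle-to-cycle jumps (a stand-in for di/dt events at a fixed supply voltage) and for windows of sustained high average power. The function name, thresholds and trace values are all hypothetical.

```python
def profile_power(power_w, window, avg_limit_w, step_limit_w):
    """Return (step_cycles, hot_window_starts) for a per-cycle power trace."""
    # Large cycle-to-cycle jumps approximate di/dt events.
    steps = [i for i in range(1, len(power_w))
             if abs(power_w[i] - power_w[i - 1]) > step_limit_w]

    # A sliding-window average flags sustained high power (thermal risk).
    hot_windows = []
    if len(power_w) >= window:
        running = sum(power_w[:window])
        for start in range(len(power_w) - window + 1):
            if start > 0:
                # Slide the window: add the new cycle, drop the oldest.
                running += power_w[start + window - 1] - power_w[start - 1]
            if running / window > avg_limit_w:
                hot_windows.append(start)
    return steps, hot_windows


# Hypothetical 8-cycle trace in watts.
trace = [1.0, 1.0, 4.0, 4.0, 4.0, 4.0, 1.0, 1.0]
steps, hot = profile_power(trace, window=3, avg_limit_w=3.5, step_limit_w=2.0)
print(steps, hot)
```

A real emulation trace spans billions of cycles, so a production tool would stream the data rather than hold it in a list. The running-sum window above at least keeps the per-cycle cost constant.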
Further, dynamic power must be managed, she said. “Early focus on eliminating redundant switching can have a high impact on reducing dynamic power in the design. Clock power and glitch are two key areas of concern at 7nm and below. Here, physical-aware RTL power analysis can help estimate clock power early and identify clock power hotspots in the design. Temporal analysis of clock, data, and clock-enabled activity provides visibility into redundant switching in the clock network, which can lead to large power wastage. And though traditional techniques like register clock gating continue to be useful, we need to go beyond them to eliminate clock switching activity at a higher level in the clock network. Increased focus on block-level and coarse clock gating can help improve clock power efficiency. Glitches, meanwhile, can constitute up to 40% of combinatorial power in designs, so reducing glitches is one of the critical needs in designs at 7nm and below. Early estimation of glitch power and identification of glitch-prone logic can guide designers to datapath structures that can be gated, balanced or retimed to reduce glitch.”
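As a rough illustration of why coarse clock gating matters, the textbook first-order model for switching power is activity × capacitance × Vdd² × frequency. That model is a standard approximation and is not taken from the article. The sketch below applies it to a hypothetical clock branch. The capacitance, voltage, frequency and 30% enable duty are made-up numbers.

```python
def dynamic_power_w(activity, cap_f, vdd_v, freq_hz):
    """First-order switching power: activity * C * Vdd^2 * f."""
    return activity * cap_f * vdd_v ** 2 * freq_hz


# Hypothetical clock branch: 100 pF switched capacitance, 0.75 V, 2 GHz.
# An ungated clock charges and discharges its load every cycle (activity 1.0).
ungated = dynamic_power_w(1.0, 100e-12, 0.75, 2e9)

# With block-level gating, the branch only toggles while its enable is high.
# Assume the block is active 30% of the time.
gated = dynamic_power_w(0.30, 100e-12, 0.75, 2e9)

saving_fraction = 1.0 - gated / ungated
print(ungated, gated, saving_fraction)
```

The model ignores leakage, short-circuit current and the overhead of the gating cells themselves, so it overstates the saving somewhat. It does show that gating higher in the clock tree removes switching on all of the capacitance below that point.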
Fig. 1: Thermal coupling due to self-heating. Source: ANSYS
In addition, self-heating brings with it temperature-related EM limitation changes, because when the temperature rises, the EM limitation drops, Zhao said. “This means it’s more likely to break down the road, which is driving design teams to think more carefully about reliability issues. Especially when considering medical devices or automotive applications, a failure in the field is going to have a direct impact on whether there is an accident that may be fatal. As such, design teams now must statistically calculate EM failures, namely the FIT (failure in time) calculation.”
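The temperature dependence Zhao describes is commonly modeled with Black's equation, in which median time to failure is proportional to J^(-n) · exp(Ea / kT). The sketch below uses it to compare electromigration lifetime at two temperatures for the same current density. This is an illustrative assumption, not Cadence's method. The 0.85 eV activation energy is a placeholder, and converting MTTF to FIT as 1e9 / MTTF assumes a constant failure rate, which is a simplification for a wear-out mechanism such as EM.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K


def em_lifetime_ratio(t_ref_c, t_hot_c, ea_ev=0.85):
    """MTTF(t_hot) / MTTF(t_ref) at equal current density (Black's equation).

    ea_ev is an assumed activation energy; real values are process-specific.
    """
    t_ref_k = t_ref_c + 273.15
    t_hot_k = t_hot_c + 273.15
    return math.exp(ea_ev / BOLTZMANN_EV_PER_K * (1.0 / t_hot_k - 1.0 / t_ref_k))


def fit_from_mttf(mttf_hours):
    """Failures per 1e9 device-hours, assuming a constant failure rate."""
    return 1e9 / mttf_hours


# Hypothetical comparison: a wire at 125 C versus a 105 C reference.
ratio = em_lifetime_ratio(105.0, 125.0)
print(ratio)  # below 1.0: the hotter wire is expected to fail sooner
```

Because the temperature term is exponential, a few tens of degrees of self-heating can cut the modeled lifetime several-fold. This is consistent with the tighter EM limits at higher temperature that Zhao describes.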
These issues are all thermal-related, he stressed. Today's designs typically pair high performance specifications with a dropping supply voltage while still drawing a lot of current, so the temperature goes up. And when the device is then put inside the package, which adds yet another level of impact in terms of IR drop, analysis of the impact of all of these things is critical.
“This is why foundries want to use technologies such as 3D packaging, in order to continue adding more functions inside of one package—to ‘extend’ the life of Moore’s Law. That’s where we get into the packaging domain. And at the 7nm nodes, all of this must be considered because the manufacturing is hard enough as interposer approaches grow in popularity,” Zhao said.
From a system point of view, there’s not a lot that is unique to 10/7nm. And while fan-out packaging adds some new rules, it doesn’t change the fundamentals of chip design. But the packaging adds its own issues.
“Fan-out packaging is essentially thinner layers,” said Brad Brim, product engineering architect in the Custom IC & PCB Group at Cadence. “It’s RDL (redistribution layer) buildup layers, only you build a support and extend it beyond the area of the die. That changes rules, so we have to do some special things for extraction and layout design rules to make sure areas aren’t filled in with too much metal, etc. But this is not really unique to 7nm.”
The choice of package should be able to relieve some of the chip side issues, but this is not always the case, he said. “When using fan-out packaging, this will hurt you on thermal because it doesn’t have as much direct metal as it used to have. So you have to be more careful as you get larger and larger circuits that consume more power. This equates to more thorough analysis of the package and die.”
It also requires much more interaction between the chip and package design teams. “It really takes a system view of things to make the largest processor chips out there. A customer will ask how much power can be dissipated on a particular package. How hot does it get? You don’t know until you put it into a package. You are designing the package and the chip at the same time and making architectural-level decisions for thermal and power delivery noise, such as, ‘Where do I put my capacitance? Do I put it on the die? Do I put it as close as I can or do I move it somewhere across the die and try to hook up to it through the package? Do I bury it in the package? Can I put it out on the board?’ This means the system teams are getting involved much earlier, and there is more cooperation across what used to be internal design boundaries.”
Zhao sees the same issues from the chip side. “In the old days, the chip guys didn’t talk too much to the package guys. I could give them a model and I could do my corner simulation, and that was fine. This is no longer the case. There is more and more interaction between the two sides, and a recognition of the need to communicate and use common sense to determine if the system is going to work. The chip guys cannot just point fingers at the package guys and say it’s their problem. This is not a dramatic change from 16nm to 7nm, but there is more and more cooperation between the two sides to make this flow more efficient.”
Fig. 2: Different packaging approaches. Source: STATS ChipPAC
Magdy Abadir, vice president of marketing at Helic, has observed this dynamic, as well. “Part of the problem is that today inside an SoC design, they design the analog pieces on one team, the RF pieces on another team, the cores somewhere else, the memories with another team. Then the digital guys are responsible for the whole SoC. They assemble everything together, and they can do extraction and LVS (layout vs. schematic), among other things. But nobody is responsible for checking if Block A and Block B on different hierarchies are talking to each other without being connected. But they could be interfering. One of them is switching, the other one is listening.”
This is giving rise to new issues to address in the form of electromagnetic crosstalk. Not to be confused with electromigration, electromagnetic crosstalk speaks to Maxwell’s equations, which are a cornerstone of electronic circuitry.
“Electromigration has to do with the fact that as technology shrinks, the wires are going to be very thin,” Abadir said. “The thickness of the wire might be a few atoms of metal, and as these things heat and change voltage and so on, there is a phenomenon called electromigration during which the atoms start moving. The atoms themselves move, whereas in electricity only the electrons move. When the atoms start moving on thinner wires, over time it breaks. So you get an open in the wires. People analyze for electromigration under different conditions and under different temperatures.”
Fig. 3: Using electromagnetic crosstalk analysis to debug silicon issues. Source: Helic
This is different from electromagnetic crosstalk, which is becoming a problem at 7nm. As signals change at fast rates, an electromagnetic wave is produced.
“Motors are based on these concepts,” said Abadir. “We charge our phones by electromagnetic coupling without actually having a wire. The energy that travels in the air to charge my phone or that allows me to hear the sounds coming from a speaker — these are electromagnetic in nature. If they are not modeled, you don’t know how strong they are. They can travel through air or metal – any path – and when they reach something they will have an impact. These waves are different at different frequencies, too, because the electromagnetic waves themselves have a frequency with them. If an electromagnetic wave hits something at the frequency you’re operating, then it can have a negative impact. This is the concept of electromagnetic crosstalk.”
In the past, this wasn’t a significant issue. But at 10/7nm, it is no longer something that can be ignored.
“Electromagnetic crosstalk is now an issue at advanced nodes because the more shrinking you do, the greater the density of metal,” he explained. “The amount of metal increases, the amount of design components obviously increases, so the risk or likelihood of having crosstalk definitely increases as the technology nodes go down. This is a trend that has been happening, and will continue to happen, no matter what we do. In the past, design teams survived by ignoring the inductance completely. They may have analyzed inductance effects within very small components such as RF components or certain SerDes blocks or the PLLs or things of that nature. Within small blocks they may do the analysis and make sure that the design of these specific components is fine. However, interference between different blocks or across different hierarchies has never been considered until perhaps a little bit before 16nm. But at 16nm, 10nm and now 7nm it’s becoming a necessity, not a luxury. Everybody has to worry about this.”
Data rates and frequency rises are only compounding problems here. “You would not design something at these nodes and have it operate slowly. The whole idea is to get more performance out of these circuits. With higher frequencies, electromagnetic waves get generated as a result of signals changing. The faster they change, the stronger the electromagnetic waves. That’s basically Maxwell’s equations, which say that as the frequency increases or data rates go from 10 gigabits to 40 gigabits to 100 gigabits, this is causing more interference such as crosstalk. If the interference gets strong enough, certain clock pulses or certain edges deteriorate to the point where you’re missing a clock or you get the wrong signal or the reset signal gets reset where it shouldn’t, or there is excessive power dissipation that was not expected,” he noted.
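The link between edge speed and interference can be illustrated with the lumped form of Faraday's law: the voltage induced in a victim loop is roughly the mutual inductance times the rate of change of the aggressor current. The sketch below is a back-of-the-envelope model, not Helic's extraction flow. The 10 pH mutual inductance, 10 mA swing and edge times are hypothetical, and the edge is assumed to be a linear ramp.

```python
def induced_voltage_v(mutual_inductance_h, delta_i_a, edge_time_s):
    """Peak voltage coupled into a victim loop: V = M * dI/dt (linear edge)."""
    return mutual_inductance_h * delta_i_a / edge_time_s


# Hypothetical aggressor: 10 mA current swing, 10 pH coupling to the victim.
slow = induced_voltage_v(10e-12, 10e-3, 100e-12)  # 100 ps edge
fast = induced_voltage_v(10e-12, 10e-3, 10e-12)   # 10 ps edge
print(slow, fast, fast / slow)
```

In this model, making the edge ten times faster makes the coupled noise ten times larger for the same layout. That is why coupling that was safely ignored at lower data rates can become a functional problem at higher ones.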
As a result, analysis is now needed to discover when there might be problems and determine what to do about them. This might mean changing the layout a bit, then re-doing the analysis.
“There are no simple rules at present for saying, ‘Do X and Y and Z, that’s it and you will be safe and the problems will go away.’ It requires doing analysis, it requires doing the simulation, and extracting these things in detail, and seeing if it is causing a problem, or the problem is tolerable or not,” he noted.
The number of challenges at advanced nodes continues to escalate. Some are extensions of existing problems, some are issues that could be ignored in the past. But at 10/7nm and below, the number of critical issues that need to be addressed is rising, and from a power perspective they are additive.
As a result, designs will require much more power analysis, and that will have to happen at a system level because it involves multiple factors that have an effect on other parts of a system, sometimes in unexpected ways. So as features shrink, the number and criticality of problems grows, and what happens in one block or area of a design can have serious implications for another part of the system.