Enabling Cheaper Design

BY BRIAN BAILEY, September 18, 2018 in Semiconductor Engineering

While the EDA industry tends to focus on cutting-edge designs, where design costs are a minor portion of the total cost of the product, the electronics industry has a very long tail. The further along the tail you go, the more significant design costs become as a percentage of total cost.

Many of those designs are traditionally built using standard parts, such as microcontrollers, but as additional sophistication is creeping into edge devices for the IoT, demand is increasing for more computational ability beyond what simple microcontrollers provide.

The list of reasons why standard parts no longer provide an acceptable solution continues to grow. In many cases, designs require custom content to reduce power consumption. Designs may also need higher levels of reliability or additional security beyond what standard parts offer.

Today, many of those parts never get built because of the economics of design. How much would the entire semiconductor market grow if design became cheaper? That does not mean a reduction in the cost of tools; it does mean providing greater productivity, even if it means giving up something else—such as area.


Fig. 1: Impact of NRE on total costs. Source: DARPA CRAFT.
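
The relationship between volume and the NRE share of total cost can be sketched with simple amortization arithmetic. The Python snippet below is only an illustration: the function name `nre_share` and all dollar figures are hypothetical and are not taken from the DARPA data. It assumes total cost is NRE plus a fixed per-unit cost times volume.

```python
def nre_share(nre_cost: float, unit_cost: float, volume: int) -> float:
    """Return the fraction of total product cost attributable to NRE.

    Assumes a simple model: total = NRE + unit_cost * volume.
    """
    if volume <= 0:
        raise ValueError("volume must be positive")
    total = nre_cost + unit_cost * volume
    return nre_cost / total


# Hypothetical numbers, chosen only to show the trend along the "long tail".
NRE_DOLLARS = 5_000_000.0
UNIT_DOLLARS = 2.0

for volume in (10_000, 1_000_000, 100_000_000):
    share = nre_share(NRE_DOLLARS, UNIT_DOLLARS, volume)
    print(f"{volume:>11,} units: NRE is {share:.1%} of total cost")
```

Under this model, the NRE share falls steadily as volume rises, which is why design cost dominates for low-volume parts and becomes a minor term for the highest-volume chips.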

This issue is becoming important enough that DARPA started several projects under the Circuit Realization At Faster Timescales (CRAFT) umbrella in 2015. Its vision is to “sharply reduce the barriers to DoD use of custom integrated circuits built using leading-edge CMOS technology while maintaining the high level of performance at power promised by this technology.”

Much of the current DoD technology is based on standard parts, again pointing to NRE costs as being prohibitive for low-volume parts. The program looked at projects such as BOOM-2 from the UC Berkeley RISC-V center as a proof of concept. Six graduate students completed a 25M-transistor design in six months back in 2014 using languages and techniques not currently deployed in traditional EDA flows.

Compare that to a recent Nvidia chip that was reported to have taken 8,000 staff years. “Not many companies can afford that,” points out Bryan Bowyer, director of engineering for the HLS group of Mentor, a Siemens Business. “Even for high-volume chips, the NRE cost is getting out of hand for everyone. The pressure is coming from everywhere.”

The attempt to reduce non-recurring engineering (NRE) is nothing new. “Since the beginning of customer-owned tooling/fabless models, there has been a continuous drive to reduce design costs,” says Tim Whitfield, vice president of strategy for Arm’s embedded and automotive business. “Beginning with the inception of high-level design languages (Verilog/VHDL) and logic synthesis, significant advances have been made to improve design quality, increase productivity, and ultimately drive down costs.”

One can also look at the tremendous productivity gains that have been enabled by Arm and other IP suppliers. But those gains got us to where we are today, and we need to go further.

Focus on the cutting edge
Traditionally, it has been the cutting-edge designs that have supported tool development. “While the focus is on the very big accounts, we also have a tremendously long tail of accounts that are looking for lower-cost and higher-productivity solutions,” says Dave Pursley, product management director at Cadence. “They cannot afford to throw tons of manpower at it, they need to figure out a better way.”

Bowyer agrees, but also indicates that this may not be healthy under all circumstances. “Companies at the current edge are still the ones who have the most influence on the tools within EDA. That is perhaps a bit of a problem. We are training our tools by going to the cutting-edge companies and then letting the rest of the industry use those tools. There are opportunities to improve that process. There is a little less pressure on things like area when you get away from the cutting edge. It doesn’t make sense to spend too much time to optimize things that don’t cost much relative to the NRE.”

Abstraction
There is little disagreement that abstraction is at the heart of any improvement. “It makes sense to move up the levels of abstraction both in the hardware side and the software side,” says Pursley. “When you do that, you are writing fewer lines of code, which means there is less to verify, and you also have code that is more reusable across generations.”

But the adoption of abstraction has stalled for a large fraction of the industry. “While the level of automation and abstraction is significantly higher than 20 years ago, the complexity has grown significantly, offsetting some of the advances,” admits Whitfield. “In terms of inexpensive design and the direction the industry is headed, there seems to be a greater focus on high-level abstraction for design, but if we can close the gap between the design description and something that is functionally verified and implementable in silicon, there is potential for wider adoption.”

High-level synthesis
The one bright area is high-level synthesis (HLS). “HLS allows you to abstract the design and that has become one of the techniques that many companies use,” points out Bowyer. “We have also seen an interest in a new class of IP that is a little more reconfigurable. Everyone wants design reuse but if you have to rejig the IP every time you respin a chip or go to a new process node, that gets in the way. HLS has an opportunity because most people don’t want to innovate on bus interfaces, so can a tool enable you to just connect pieces of IP through a collection of interfaces and have the assembly taken care of?”

Most adopters of HLS are using it to create optimized solutions. “With HLS, you can create multiple implementations and you can look at power, performance and area from one description,” adds Pursley. “So, you get productivity, but you also get the benefit of architectural exploration. There is always the assumption that you will give up something, like performance, power or area. Area directly equates to cost. The transforms inside the tool often mean that you do not have to give up any PPA. What you do have to give up is the methodology that you were using. You do give up some control.”

And there are examples where the usage of HLS has enabled some very fast tapeouts. “It took us only four months to go from spec to tape-out,” says David Garrett, vice president of hardware at Syntiant. “For the first two months of the design cycle, we used high-level synthesis to generate multiple implementations of every block, including speed, area, and power estimates from logic synthesis for each one. This enabled us to optimize the SoC by making trade-off decisions with hard numbers before pushing it through the RTL to GDS flow.”
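
The kind of trade-off decision described above, choosing among several generated implementations of a block using area, power and speed estimates, can be sketched as a Pareto-front filter. This is a minimal Python illustration under stated assumptions, not the flow used by any vendor or by Syntiant: the `Impl` type, the candidate names and every number are hypothetical.

```python
from typing import List, NamedTuple


class Impl(NamedTuple):
    """One candidate implementation of a block (all values hypothetical)."""
    name: str
    area_um2: float
    power_mw: float
    latency_cycles: int


def dominates(a: Impl, b: Impl) -> bool:
    """True if a is no worse than b on every metric and better on at least one.

    Lower is better for area, power and latency.
    """
    a_metrics, b_metrics = a[1:], b[1:]
    no_worse = all(x <= y for x, y in zip(a_metrics, b_metrics))
    better = any(x < y for x, y in zip(a_metrics, b_metrics))
    return no_worse and better


def pareto_front(impls: List[Impl]) -> List[Impl]:
    """Keep only the candidates that no other candidate dominates."""
    return [c for c in impls if not any(dominates(o, c) for o in impls)]


# Hypothetical estimates for four variants of the same block.
candidates = [
    Impl("unroll1", 1000.0, 1.0, 64),
    Impl("unroll2", 1800.0, 1.6, 32),
    Impl("unroll4", 3500.0, 3.0, 16),
    Impl("unroll4_noshare", 4200.0, 3.4, 16),  # worse than unroll4 everywhere
]

front = pareto_front(candidates)
print([impl.name for impl in front])
```

The filter removes only candidates that are strictly worse on the numbers. Choosing among the remaining points is still an architectural decision that depends on the chip's priorities.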

Improved flows
The use of standard interfaces is one technique being adopted by an increasing number of companies. “There will be an inclination to start moving towards easy-to-use interfaces so that changes can easily be made at a higher level,” explains Prashant Varshney, group director, product management, Mentor, a Siemens Business. “Then, the integration of tools and automation will be preferred, which is what we are seeing for emerging markets, who want to start from a high-level of abstraction and have the tools do things automatically rather than them having to harden the IP at every stage of the flow.”

The idea of having semi-flexible puzzle pieces with a fixed set of interfaces is a strategy increasingly being adopted within the industry. “In order to make it fully plug and play, no matter what I do I am able to plug in all of these components, you do have to take care that the design is latency insensitive by using handshakes etc.,” points out Pursley. “If you are willing to do that, and some companies are, to get the productivity benefits, then you can use that methodology.”

However, Pursley provides a warning. “Humans still want to see if they can remove three flops by getting rid of the handshake.”
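
To illustrate what latency insensitivity through handshakes means, the following Python sketch models a single channel with a valid/ready style handshake, where data moves only in cycles in which the producer offers it and the consumer accepts it. It is a toy cycle-by-cycle model written for this illustration. The function `run_handshake` and its parameters are hypothetical and do not correspond to any tool or interface standard named in the article.

```python
from typing import Callable, List, Tuple


def run_handshake(
    items: List[str],
    consumer_ready: Callable[[int], bool],
    producer_valid: Callable[[int], bool],
    max_cycles: int = 1000,
) -> Tuple[List[str], int]:
    """Simulate one handshaked channel and return (received items, cycles used).

    A transfer happens only in a cycle where valid and ready are both true.
    """
    received: List[str] = []
    index = 0
    cycle = 0
    while index < len(items):
        if cycle >= max_cycles:
            raise RuntimeError("channel stalled for too long")
        valid = producer_valid(cycle)
        ready = consumer_ready(cycle)
        if valid and ready:
            received.append(items[index])
            index += 1
        cycle += 1
    return received, cycle


data = ["a", "b", "c", "d"]

# Consumer that always accepts versus one that accepts every third cycle.
fast, fast_cycles = run_handshake(data, lambda c: True, lambda c: True)
slow, slow_cycles = run_handshake(data, lambda c: c % 3 == 2, lambda c: True)

print(fast == slow, fast_cycles, slow_cycles)
```

Both runs deliver the same data in the same order, and only the cycle count changes. That property is what lets blocks with different latencies be swapped without changing the surrounding logic, at the cost of the extra handshake hardware.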

Again, problems may be created by concentrating on the cutting edge. “We are realizing that training on cutting edge designs just gives you an expanded feature set,” adds Varshney. “While many of these may apply to other designs, you need something more. For example, a design at 180nm was using only three layers for routing, which you would never see in a cutting-edge application. However, this requires that you manage resources differently and things like the global routing engines have to be tuned differently.”

There is another big difference between low volume and cutting-edge designs. “When you have a cutting-edge customer, you bring them close to design closure and they are extremely happy,” continues Varshney. “They have people sitting there ready to do the last mile. But when you go to the other category of customer, they have one person doing the entire chip. If you have one DRC or one hold violation, they consider this to be a bug in the tool. The expectation is very different. This creates additional requirements for EDA tools to be automating to a greater extent than what was being done in the past.”

Machine learning is seen as one technology that may help to bridge the divide. “We would like to head in a direction where a user can take the learning from all of the cutting-edge designs and apply them to smaller or older geometries, or to lower volume designs,” says Bowyer. “That is a manual process today and machine learning may be able to help – let’s look at the patterns of things that trip up tools or tool chains, start to understand and resolve them. Today, it is expected that at every major step along the way, there is a human that has to constrain the tools, that has to make changes or adjustments.”

Some are already seeing value in machine learning. “Machine learning in the design automation process will become more important, and Arm is already using machine learning techniques to accelerate its functional verification,” says Whitfield. “These techniques will help with synthesis and physical implementation, and potentially enable a push button route to optimal PPA trade-off.”

But one barrier is getting in the way of reducing NRE costs. “No matter what kind of company you are, everyone will perform as much verification as they can,” points out Pursley. “If you make verification faster, that is great, but that will not reduce the cost. They will spend as much as they need to and wish they could spend more. There is always more to do. It is not really a fixed cost, but it kind of is.”

However, there are verification advantages associated with HLS. “What does change on the verification side, is that you can be doing more of your verification at higher levels of abstraction,” adds Pursley. “HLS allows you to do more verification in C or SystemC. That ensures that what you are putting into the rest of the system is correct.”

Chiplets
Another promising direction is direct reuse of hard IP in the form of chiplets. “If you look at general-purpose IP, such as an application processor or microcontrollers, these are the ones that are highly optimized from the architectural level,” says Varshney. “Their implementation is done in a way that squeezes everything out of the process node. So, there is room for having these kinds of general purpose IPs to be hardened and there will be a market for that.”

There could be many optimizations possible within existing process nodes. “We will see if the industry can settle on a node for a long enough period of time such that the investment becomes worth it,” adds Bowyer.

Some companies are gearing up for it. “Consider advanced chip packaging techniques as well as 3D IC stacking with all its variants such as WOW, INFO, etc.,” says Magdy Abadir, vice president of marketing at Helic. “A key enabler to the cost, size, and performance advantages of these technologies can only be realized with the help of tools that can analyze the electromagnetic coupling between all the interacting metal layers, redistribution layers and packaging structures that are placed very close together.”

But other process optimizations are also possible, especially when considering domain specific applications. “Most designs include large spiral inductors,” adds Abadir. “Significant area reduction can be achieved by moving these large inductors on top of densely routed areas and capacitor banks. Tools are required to make sure no significant coupling would result between the inductors and the other structures underneath.”


Fig. 2: VCO folding example achieves significant area reduction. Source: Helic.

Will the long tail last?
The extent of the DARPA program is an acknowledgment that there are no easy solutions to this problem; otherwise, they would already have been implemented. However, university and industrial projects have shown that with different tooling and methodologies, it is possible to create complex designs in a reasonable timeframe.

The big question for DARPA is whether there is enough money to support the development of tools that target small-volume products. “Will there be enough little IoT companies long term to keep this going?” asks Bowyer. “I am not sure but today there are. There has been a huge investment in small companies to do machine learning and IoT. So today, there is the push to support them.”