Deloitte predicts that the market for specialized chips optimized for generative AI will be over US$50 billion in 2024. Up from close to zero in 2022, that sum is expected to make up two-thirds of all AI chip sales for the year. Deloitte predicts that total AI chip sales in 2024 will be 11% of the predicted global chip market of US$576 billion.1 Recent forecasts for the AI chip market in 2027 range from an aggressive US$400 billion to a more conservative US$110 billion. For several reasons, the more conservative estimates may be more realistic.2
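As a quick back-of-the-envelope check on these figures, the ratios implied by the cited numbers can be sketched as follows (this is illustrative only: the US$576 billion market, 11% share, and "over US$50 billion" generative AI figure are the estimates above, and the rounding and derived shares are ours):

```python
# Back-of-the-envelope check of the cited 2024 figures.
# All inputs are the article's estimates; derived values are illustrative.

global_chip_market_2024 = 576e9      # predicted global chip market, US$
ai_share_of_market = 0.11            # AI chips as a share of that market
gen_ai_chips_2024 = 50e9             # generative AI chip sales, "over US$50 billion"

total_ai_chips_2024 = global_chip_market_2024 * ai_share_of_market   # ~US$63 billion
gen_ai_share_of_ai = gen_ai_chips_2024 / total_ai_chips_2024         # roughly two-thirds or more

print(f"Implied total AI chip sales in 2024: ~US${total_ai_chips_2024 / 1e9:.0f} billion")
print(f"Generative AI share of AI chip sales: ~{gen_ai_share_of_ai:.0%}")
```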
On the other hand, there are those who fear a generative AI chip bubble: Sales will be massive in 2023 and 2024, but actual enterprise generative AI use cases could fail to materialize, and in 2025, AI chip demand could collapse, much as demand for crypto-mining chips did in 2018 and 2021.3
Even at the lower end of the non-bubble range, AI chips would be a very large part of the semiconductor market, and a needed tailwind given anticipated sluggish demand from traditional stalwarts such as smartphones,4 PCs,5 and more mature data center chips.
At a high level, generative AI is similar to many other kinds of AI used in recent years: It's a form of machine learning, specifically deep learning built on neural networks. But there are important differences, and the major chip companies, as well as others, have built or are building chips specially optimized for generative AI, because older AI chips are too slow or too inefficient for the task and lack the right kind of design and memory.6
In spring 2023, the most advanced of these special chips were selling for about US$40,000 each.7 There was strong demand for a million or more of them. The coveted chips were in severe shortage and on allocation (mainly due to a lack of advanced packaging capacity), and they were a chokepoint for the rollout of generative AI offerings from thousands of companies.8 Many of the companies that make these chips can't make them fast enough, and that imbalance is expected to continue well into 2024.9 Demand is high, supply is constrained, and prices are high.
Perhaps even more important are the geopolitical implications: The special generative AI chips require a host of advanced technologies from all over the world, although they are fabricated mostly in Asia now and are likely to remain highly concentrated there in the future. These chips are increasingly subject to trade restrictions imposed by the US, Europe, and their Asian allies on exports to China and Russia.10 Although China can develop its own generative AI data sets and software, it may find it harder to buy or make the most advanced chips needed for cutting-edge AI processing over the next five years. It's also unclear to what extent China may have been able to advance its chip production in spite of restrictions. For example, in September 2023, a Chinese chip manufacturer produced a smartphone chip made on a 7 nm process node. That chip was smaller than leading generative AI chips and made on a process two to three generations behind them, but it was closer to the leading edge than most Western analysts believed possible, given Western sanctions.11
At the heart of the current state-of-the-art generative AI hardware is a rack-scale board made up of different kinds of chips and interconnections.
Those boards combine central processing units with specialized (very large, very advanced process node) graphics processing units (GPUs) that sit in a special kind of packaging alongside a special kind of high-speed memory.12 In one example, the GPU is a piece of silicon called a die of over 800 mm² (which is very large), containing 80 billion transistors, packaged with very large, very fast, high-bandwidth memory (HBM3) in what is called 2.5D advanced packaging.13 This packaging can be done either at the end of the foundry process or at the start of the back-end assembly and test process by an outsourced assembly and test player.14
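For a sense of scale, the transistor density implied by that example can be sketched with simple arithmetic (the die area and transistor count are the cited figures; the interpretation in the comments is ours):

```python
# Rough transistor-density arithmetic for the example GPU described above.
# Die area and transistor count come from the text; everything else is illustrative.

die_area_mm2 = 800        # "over 800 mm^2", close to the practical reticle limit
transistors = 80e9        # 80 billion transistors

density_per_mm2 = transistors / die_area_mm2
print(f"~{density_per_mm2 / 1e6:.0f} million transistors per mm^2")

# Densities in this range generally require leading-edge process nodes, which is
# one reason these GPUs depend on the most advanced fabs and on 2.5D packaging
# that places HBM stacks directly alongside such a large die.
```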
Inside the data centers where most of these generative AI accelerators will be located, it's often necessary to move large chunks of data over short distances as rapidly as possible, using special networking chips.15 These communications chips are not used solely for generative AI applications, but generative AI is one of the biggest drivers of their use at present, and they are likely worth single-digit billions of dollars in 2024.16
Finally, generative AI chips use a lot of power, approximately 10 kilowatts per board, and multiple boards would produce far more heat than air cooling can cope with, so the market for liquid cooling is likely to reach US$2 billion to US$3 billion in spending in 2024, growing at about 25% annually.17 Those high power draws could also require new high-voltage power supplies.18 Using higher voltages could offer significant efficiency gains and is likely a sub-US$1 billion annual market spread across a number of smaller players.19
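The cooling argument can be made concrete with a short sketch; the 10 kW per board figure, the US$2 billion to US$3 billion 2024 estimate, and the roughly 25% growth rate come from the text above, while the boards-per-rack count and air-cooling ceiling are illustrative assumptions rather than vendor specifications:

```python
# Sketch of the power and cooling arithmetic described above.

board_power_kw = 10            # per the text: ~10 kW per board
boards_per_rack = 4            # assumption: illustrative rack configuration
air_cooling_limit_kw = 20      # assumption: rough ceiling for an air-cooled rack

rack_power_kw = board_power_kw * boards_per_rack
print(f"Estimated rack heat load: ~{rack_power_kw} kW "
      f"(vs. ~{air_cooling_limit_kw} kW air-cooled ceiling)")

# Projecting the liquid-cooling market forward at ~25% per year from 2024:
liquid_cooling_market = 2.5e9  # midpoint of the US$2-3 billion 2024 estimate
growth_rate = 0.25
for year in range(2024, 2028):
    print(f"{year}: ~US${liquid_cooling_market / 1e9:.1f} billion")
    liquid_cooling_market *= 1 + growth_rate
```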
Deloitte is relatively confident in its prediction of a generative AI-driven market opportunity of about US$50 billion in 2024, but what happens after the current high demand and high prices are met by higher supply and new entrants is unclear.
The 2027 numbers mentioned earlier (up to US$400 billion) are potentially important to the global semiconductor industry and come from reputable sources, but there are several reasons why they might be too optimistic.
First, the summer 2023 market for generative AI GPUs is marked by having essentially a single designer, which in turn relies on a single, capacity-constrained supplier.20 Meanwhile, buyers are trying to secure as many chips as they can to build processing capacity for anticipated consumer and enterprise use of generative AI training and inference.21 As a result, pricing could be roughly as high as it might ever be. As that supplier builds more capacity, or as new competitors enter the market, prices are more likely to decline than to stay where they are, which would affect revenues for 2025 and beyond.
Second, when chip customers are on allocation and are unlikely to receive their complete orders, they often over-order. Knowing that orders are being cut back by 75%, for instance, a buyer whose "true demand" is 25,000 chips might order 100,000 in order to end up with the 25,000 it actually needs. Once supply and AI chip demand are more in balance, buyers may receive more chips than they need and then pull back just as new capacity is coming online, as the sketch below illustrates. This is part of the semiconductor "bullwhip effect" and a contributor to the extreme cyclicality the chip industry has historically seen.22
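A minimal sketch of that over-ordering arithmetic, using the text's example of a 75% cutback against a true demand of 25,000 chips (the "full delivery" scenario at the end is an illustrative assumption about what happens once allocation ends):

```python
# Illustrative sketch of the over-ordering ("bullwhip") dynamic described above.

true_demand = 25_000
fill_rate_on_allocation = 0.25      # orders cut back by 75%

# To receive its true demand while on allocation, the buyer inflates its order:
inflated_order = true_demand / fill_rate_on_allocation
print(f"Order placed while on allocation: {inflated_order:,.0f} chips")   # 100,000

# Assumption: if supply later catches up and the inflated order book is filled
# in full, the buyer receives far more than it needs and pulls back sharply.
excess = inflated_order - true_demand
print(f"Potential excess once supply normalizes: {excess:,.0f} chips")
```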
Third, currently all training and almost all generative AI inference is done using the same data center generative AI chips. But it's likely that over time a significant part of generative AI inference will be done on edge processors.23 These could be smaller GPUs, CPUs, or new application-specific integrated circuits, and they could come from existing generative AI chip companies or new entrants, including both traditional edge-processing chip companies and companies not traditionally known for designing chips.24 Doing more at the edge could expand the market, or it could cause prices of data center generative AI chips to fall.
Finally, as mentioned earlier, there are those who worry about a generative AI chip bubble, with a robust 2023 and 2024 followed by a weaker 2025. This view is not the consensus, but it is worth being mindful of, given the possibility of a boom and bust.
While it's hard to say with certainty, a combination of higher and more diversified supply, lower-than-predicted AI chip demand, inference moving to edge processors, and lower prices could plausibly push the 2027 AI chip market toward the lower end of the potential US$110 billion to US$400 billion range, which would still be more than double 2024 levels.
Regardless of whether the 2027 market is closer to US$110 billion or US$400 billion, companies are likely to need AI chips, especially generative AI chips, and to regard secure supply and reliable supply chains as critical for innovation, economic success, and national security.
And herein lies a challenge for the United States and Europe. Although multiple chipmakers are building advanced-node plants capable of making cutting-edge CPUs and GPUs for AI and generative AI,25 there is not enough existing packaging capacity in Europe or the United States from either front-end or back-end companies.26 Equally, there are no significant existing or planned HBM or HBM3e plants in either the United States or Europe.27 Although the generative AI dies could be made domestically, they would likely need to be sent to Asia (Southeast Asia, South Korea, or Taiwan) for both the HBM3 memory and the advanced packaging portions of the process.
Both the European Chips Act and the US CHIPS and Science Act have money set aside for advanced packaging and advanced memory investments,28 but it is unclear whether that funding will be enough for each region to become self-sufficient in packaging generative AI chips.
The final implication of the growth in generative AI chips concerns China. At present, the United States, the Netherlands, and Japan all have export controls in effect that prevent China from purchasing advanced-node chips of all types, including generative AI chips, as well as the tools and know-how needed to make them.29 Amid concerns that future export controls may target less advanced chips,30 leading Chinese internet companies ordered US$5 billion of generative AI chips in August 2023, ahead of further potential US restrictions.31
If generative AI is as important to innovation, economic growth, and national security in 2027 as it appears to be today, and if China remains restricted from purchasing advanced AI chips and the tools needed to build its own, those restrictions could have further effects on the global economy, including the potential for export restrictions on needed raw materials (see raw materials prediction) and other negative effects that could dampen global growth.