After years of “software eating the world,” it’s hardware’s turn to feast. We previewed in the computation chapter of Tech Trends 2024 that as Moore’s Law comes to its supposed end, the promise of the AI revolution increasingly depends on access to the appropriate hardware. Case in point: NVIDIA is now one of the world’s most valuable (and watched) companies, as specialized chips become an invaluable resource for AI computation workloads.1 According to Deloitte research based on a World Semiconductor Trade Statistics forecast, the market for chips used only for generative AI is projected to reach over US$50 billion this year.2
A critical hardware use case for enterprises may lie in AI-embedded end-user and edge devices. Take personal computers (PCs), for instance. For years, enterprise laptops have been commoditized. But now, we may be on the cusp of a significant shift in computing, thanks to AI-embedded PCs. Companies like AMD, Dell, and HP are already touting the potential for AI PCs to “future-proof” technology infrastructure, reduce cloud computing costs, and enhance data privacy.3 With access to offline AI models for image generation, text analysis, and speedy data retrieval, knowledge workers could be supercharged by faster, more accurate AI. That said, enterprises should be strategic about refreshing end-user compute at scale: there is no sense in squandering AI resources that remain in short supply.
Of course, all of these advancements come at a cost. Data centers are a new focus of sustainability concerns as the energy demands of large AI models continue to grow.4 The International Energy Agency has suggested that AI could significantly increase data centers’ electricity consumption by 2026, to a level equivalent to Sweden’s or Germany’s annual demand.5 A recent Deloitte study on powering AI estimates that global data center electricity consumption may triple in the coming decade, largely due to AI demand.6 Innovations in energy sources and efficiency are needed to make AI hardware more accessible and sustainable, even as it proliferates and finds its way into everyday consumer and enterprise devices. Consider that Unit 1 of the nuclear plant Three Mile Island, which was shut down five years ago for economic reasons, is slated to reopen by 2028 to power data centers with carbon-free electricity.7
Looking forward, AI hardware is poised to step beyond IT and into the Internet of Things. An increasing number of smart devices could become even more intelligent as AI enables them to analyze their usage and take on new tasks (as agentic AI, discussed in “What’s next for AI?”, advances). Today’s benign use cases (like AI in toothbrushes) are not indicative of tomorrow’s robust potential (like AI in lifesaving medical devices).8 The true power of hardware could be unlocked when smarter devices bring about a step change in our relationship with robotics.
A generation of technologists has been taught to believe software is the key to return on investment, given its scalability, ease of updates, and intellectual property protections.9 But now, hardware investment is surging as computers evolve from calculators to cogitators.10 We wrote last year that specialized chips like graphics-processing units (GPUs) were becoming the go-to resources for training AI models. In its 2024 TMT Predictions report, Deloitte estimated that total AI chip sales in 2024 would be 11% of the predicted global chip market of US$576 billion.11 Growing from roughly US$50 billion today, the AI chip market is forecast to reach up to US$400 billion by 2027, though a more conservative estimate is US$110 billion (figure 1).12
Large tech companies are driving a portion of this demand, as they may build their own AI models and deploy specialized chips on-premises.13 However, enterprises across industries are seeking compute power to meet their IT goals. For instance, according to a Databricks report, the financial services industry has seen the highest growth in GPU usage, up 88% over the past six months, as firms run large language models (LLMs) for fraud detection and wealth management.14
All of this demand for GPUs has outpaced capacity. In today’s iteration of the Gold Rush, the companies providing “picks and shovels,” the tools of the tech transformation, are winning big.15 NVIDIA’s CEO Jensen Huang has noted that cloud GPU capacity is mostly filled, but the company is also rolling out new chips that are significantly more energy-efficient than previous iterations.16 Hyperscalers are buying up GPUs as they roll off the production line, spending almost US$1 trillion on data center infrastructure to accommodate the demand from clients who rent GPU usage.17 All the while, the energy consumption of existing data centers is pushing aging power grids to the brink globally.18
Understandably, enterprises are looking for new solutions. While GPUs are crucial for handling the high workloads of LLMs or content generation, and central processing units are still table stakes, neural processing units (NPUs) are now in vogue. NPUs, which mimic the brain’s neural network, can accelerate smaller AI workloads with greater efficiency and lower power demands,19 enabling enterprises to shift AI applications away from the cloud and apply AI locally to sensitive data that can’t be hosted externally.20 This new breed of chip is a crucial part of the future of embedded AI.
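To make the local-versus-cloud distinction concrete, the sketch below shows how an application might prefer an on-device accelerator and fall back to the CPU using ONNX Runtime’s execution providers. It is a minimal sketch under stated assumptions: provider names vary by vendor (QNNExecutionProvider shown here targets Qualcomm NPUs), and model.onnx stands in for a hypothetical local image model.

```python
# A minimal sketch: run inference on a local NPU when one is available,
# falling back to the CPU. Provider availability depends on the vendor and
# driver stack; "model.onnx" is a hypothetical local model file.
import numpy as np
import onnxruntime as ort

available = ort.get_available_providers()
preferred = [p for p in ("QNNExecutionProvider", "CPUExecutionProvider")
             if p in available]

session = ort.InferenceSession("model.onnx", providers=preferred)

# Sensitive inputs never leave the device; the shape assumes an image model.
input_name = session.get_inputs()[0].name
outputs = session.run(None, {input_name: np.zeros((1, 3, 224, 224),
                                                  dtype=np.float32)})
print("Ran on:", session.get_providers()[0])
```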
Vivek Mohindra, senior vice president of corporate strategy at Dell Technologies, says, “Of the 1.5 billion PCs in use today, 30% are four years old or more. None of these older PCs have NPUs to take advantage of the latest AI PC advancements.”21 A great refresh of enterprise hardware may be on the horizon. As NPUs enable end-user devices to run AI offline and allow models to become smaller to target specific use cases, hardware may once again be a differentiator for enterprise performance. In a recent Deloitte study, 72% of respondents said they believe generative AI’s impact on their industry will be “high to transformative.”22 Once AI is at our fingertips thanks to mainstream hardware advancements, that number may edge closer to 100%.
The heady cloud-computing highs of assumed unlimited access are giving way to a resource-constrained era. After being relegated to a utility for years, enterprise infrastructure (for example, PCs) is once again strategic. Specifically, specialized hardware will likely be crucial to three significant areas of AI growth: AI-embedded devices and the Internet of Things, data centers, and advanced physical robotics. While the impact on robotics may occur over the next few years, as we discuss in the next section, we anticipate that enterprises will be grappling with decisions about the first two areas over the next 18 to 24 months. While AI scarcity and demand persist, the following areas may differentiate leaders from laggards.
By 2025, more than 50% of data could be generated by edge devices.23 As NPUs proliferate, more and more devices could be equipped to run AI models without relying on the cloud. This is especially true as generative AI model providers opt to create smaller, more efficient models for specific tasks, as discussed in “What’s next for AI?” With quicker response times, decreased costs, and greater privacy controls, hybrid computing (that is, a mix of cloud and on-device AI workloads) could be a must-have for many enterprises, and hardware manufacturers are betting on it.24
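What that hybrid split can look like in practice is sketched below: requests involving sensitive data or tight latency budgets run on-device, while bulk work goes to a hosted model. The Workload fields and the run_local and run_cloud stubs are illustrative assumptions, not a prescribed architecture.

```python
# A hedged sketch of hybrid computing: privacy- or latency-sensitive work
# stays at the edge; everything else rents cloud capacity. All names here
# are hypothetical.
from dataclasses import dataclass

@dataclass
class Workload:
    prompt: str
    contains_pii: bool       # e.g., flagged by an upstream data classifier
    latency_sensitive: bool  # e.g., an interactive end-user feature

def run_local(prompt: str) -> str:
    return f"[on-device NPU model] {prompt}"   # placeholder stub

def run_cloud(prompt: str) -> str:
    return f"[hosted cloud model] {prompt}"    # placeholder stub

def route(w: Workload) -> str:
    # Privacy and responsiveness push work to the edge.
    if w.contains_pii or w.latency_sensitive:
        return run_local(w.prompt)
    return run_cloud(w.prompt)

print(route(Workload("summarize this patient record",
                     contains_pii=True, latency_sensitive=False)))
```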
According to Dell Technologies’ Mohindra, processing AI at the edge is one of the best ways to handle the vast amounts of data required. “When you consider latency, network resources, and just sheer volume, moving data to a centralized compute location is inefficient, ineffective, and not secure,” he says. “It’s better to bring AI to the data, rather than bring the data to AI.”25
One major bank predicts that AI PCs will account for more than 40% of PC shipments in 2026.26 Similarly, nearly 15% of 2024 smartphone shipments are predicted to be capable of running LLMs or image-generation models.27 Alex Thatcher, senior director of AI PC experiences and cloud clients at HP, believes that the refresh in devices will be akin to the major transition from command-line inputs to graphical user interfaces that changed PCs in the 1990s. “The software has fundamentally changed, replete with different tools and ways of collaborating,” he says. “You need hardware that can accelerate that change and make it easier for enterprises to create and deliver AI solutions.”28 Finally, Apple and Microsoft have also fueled the impending hardware refresh by embedding AI into their devices this year.29
As choices proliferate, good governance will be crucial, and enterprises must ask: How many of our people need to be armed with next-generation devices? Chip manufacturers are in a race to improve AI horsepower,30 but enterprise customers can’t afford to refresh their entire edge footprint with each new advancement. Instead, they should develop a tiered adoption strategy that puts these devices where they can have the most impact.
When deciding whether to buy or rent specialized hardware, organizations typically weigh the cost model over time, the expected time frame of use, and how quickly the underlying technology is advancing. However, AI is applying another level of competitive pressure to this decision. With hardware like GPUs still scarce and the market clamoring for AI updates from all organizations, many companies have been tempted to rent as much computing power as possible.
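The underlying decision lends itself to simple break-even arithmetic. The sketch below compares owning a GPU server with renting equivalent cloud capacity; every figure is a hypothetical assumption rather than a market quote, and a real analysis would also factor in depreciation, staffing, and demand uncertainty.

```python
# Illustrative buy-vs.-rent break-even arithmetic. All inputs are assumptions.
purchase_cost = 250_000.0     # assumed price of an 8-GPU server, fully loaded
hosting_per_year = 40_000.0   # assumed power, cooling, and maintenance
rental_per_gpu_hour = 4.0     # assumed cloud rate per GPU-hour
gpus, utilization = 8, 0.5    # assumed fleet size and average utilization

# Annual cost of renting the same effective capacity from the cloud
rented_per_year = rental_per_gpu_hour * gpus * utilization * 24 * 365

# Owning avoids rental fees but adds hosting costs on top of the purchase
years_to_break_even = purchase_cost / (rented_per_year - hosting_per_year)
print(f"Renting: ~${rented_per_year:,.0f}/yr; "
      f"buying breaks even in {years_to_break_even:.1f} years")
```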
Organizations may struggle to take advantage of AI if they don’t have their data enablement in order. Rather than scrambling for GPUs, it may be more efficient to understand where the organization is ready for AI. Some workloads involve private or sensitive data, and investing in NPUs can keep them offline; others may be fine for the cloud. Thanks to the lessons of cloud computing over the past decade, enterprises know that the cost of runaway models operating on runaway hardware can quickly balloon.31 Pushing these costs to operating expenditure may not be the best answer.
Some estimates even say that GPUs are underutilized.32 Thatcher believes enterprise GPU utilization is only 15% to 20%, a problem that HP is addressing through new, efficient methods: “We’ve enabled every HP workstation to share its AI resources across our enterprise. Imagine the ability to search for idle GPUs and use them to run your workloads. We’re seeing up to a sevenfold improvement in on-demand computing acceleration, and this could soon be industry standard.”33
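The sharing mechanics Thatcher describes are HP’s own, but the detection step is easy to picture. Below is a minimal sketch that uses NVIDIA’s management library (pynvml) to flag under-utilized GPUs on a single host; the 20% threshold is an assumption echoing the utilization figures above, and fleet-wide sharing would need scheduling and security layers well beyond this.

```python
# A minimal sketch: flag idle GPUs on one host as candidates for shared
# workloads. The 20% threshold is an assumption; requires an NVIDIA GPU
# and the pynvml package.
import pynvml

pynvml.nvmlInit()
try:
    idle = []
    for i in range(pynvml.nvmlDeviceGetCount()):
        handle = pynvml.nvmlDeviceGetHandleByIndex(i)
        busy = pynvml.nvmlDeviceGetUtilizationRates(handle).gpu  # percent
        if busy < 20:
            idle.append((i, busy))
    print("Idle GPU candidates (index, utilization %):", idle)
finally:
    pynvml.nvmlShutdown()
```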
In addition, the market for AI resources on the cloud is ever-changing. For instance, concerns around AI sovereignty are increasing globally.34 While companies around the world have been comfortable running their e-commerce platforms or websites on American cloud servers, AI’s applicability to national intelligence and data management makes some hesitant to place AI workloads overseas. This opens up a market for new national AI cloud providers and private cloud players.35 GPU-as-a-service computing startups are an alternative to hyperscalers.36 As a result, the market for renting compute power may soon become more fragmented, which could give enterprise customers more options.
Finally, AI may be top of mind for the next two years, but today’s build versus buy decisions could have impacts beyond AI considerations. Enterprises may soon consider using quantum computing for the next generation of cryptography (especially as AI ingests and transmits more sensitive data), optimization, and simulation, as we discuss in "The new math."
Much has been said about the energy use of data centers running large AI models. Major bank reports have questioned whether we have the infrastructure to meet AI demand.37 The daily power usage of major chatbots has been equated to the daily consumption of nearly 180,000 US households.38 In short, AI requires unprecedented resources from data centers, and aging power grids are likely not up to the task. While many companies may be worried about getting their hands on AI chips like GPUs to run workloads, sustainability may well be a bigger issue.
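That household comparison can be sanity-checked with back-of-the-envelope arithmetic. The sketch below assumes roughly 10,600 kWh per US household per year, which is approximately the EIA average; the implied daily chatbot draw lands in the single-digit gigawatt-hours.

```python
# Back-of-the-envelope check of the household equivalence cited above.
# Assumes ~10,600 kWh/year per US household (approximately the EIA average).
households = 180_000
kwh_per_household_per_day = 10_600 / 365           # ≈ 29 kWh per day
daily_gwh = households * kwh_per_household_per_day / 1e6
print(f"≈ {daily_gwh:.1f} GWh per day")            # on the order of 5 GWh/day
```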
Multiple advancements that aim to make AI more sustainable are underway. Enterprises should track these areas over the next two years when considering data centers for AI (figure 2):
Finally, an infrastructure resurgence wouldn’t be complete without a nod to connectivity. As edge devices proliferate and companies rely on renting GPU usage from data centers, the complexities of interconnectivity could multiply. High-performance interconnect technologies like NVIDIA’s NVLink are already primed for communications between advanced GPUs and other chips.45 Advancements in 6G could integrate global terrestrial and non-terrestrial networks (like satellites) for ubiquitous connectivity, such that a company in Cape Town relying on a data center in Reykjavik experiences minimal lag.46
As The Wall Street Journal has noted, the AI transformation for enterprises is akin to the transition to electric vehicles that many car manufacturers are navigating.47 Technology infrastructure needs to be rethought component by component, and the decisions made today around edge footprint, investment in specialized hardware, and sustainability can have lasting impacts.
If today’s hardware requires a strategic refresh, enterprises may have much more on their plates in the next decade, when robotics become mainstream and smart devices become worthy of their label. Consider the example of the latest smart factories, which use a cascade of computer vision, ubiquitous sensors, and data to build machines that can learn and improve as they manufacture products.48 Instead of simply providing readings or adjusting a single parameter, as a thermostat does, mesh networks of multiple AI-embedded devices can create collaborative compute environments and orchestrate diverse resources.49
Another form of smart factory is being developed by Mytra, a San Francisco–based company that simplifies the manual process of moving and storing warehouse materials. The company has developed a fully modular storage system composed of steel cubes, which can be assembled in any configuration to support three-dimensional movement and storage of materials within, manipulated by robots and optimized through software.50 Chris Walti, chief executive officer of Mytra, believes this modular approach unlocks automation for any number of unpredictable future applications: “It’s one of the first general-purpose computers for moving matter around in 3D space.”51
Walti believes there is immense potential to apply robotics to relatively constrained problems, such as moving material in a grid or driving a vehicle in straight lines.52 Until now, in many cases, a good robot has been hard to find. Sustainability, security, and geopolitics are all salient concerns for such a technology. And that is assuming we can muster the infrastructure noted earlier, including data, network architecture, and chip availability, to make such a leap possible. As the saying goes, “hardware is hard.”53 Over the next decade, advancements in robotics applied to increasingly complex situations could revolutionize the nature of manufacturing and other physical labor. That potential leads directly to humanoid robotics: bots that are dynamic, constantly learning, and capable of doing what we do.
Economists and businesses alike have argued that aging populations and labor shortages necessitate greater investment in robotics and automation.54 In many cases, this entails large industrial robots completing relatively simple tasks, as noted above, but more complex tasks require “smarter” mechanical muscle that can move around as humans do. Take the example of Figure AI’s humanoid robots tested at the BMW plant in Spartanburg, South Carolina.55 The autonomous robot, through a combination of computer vision, neural networks, and trial and error, successfully assembled parts of a car chassis.56
As the furthest star of progress in this realm, we might anticipate humanoid robots performing a broad variety of tasks, from cleaning sewers to ferrying materials between hospital rooms or even performing surgeries.57 Just as AI is currently transforming knowledge work, the increased presence of robots could greatly affect physical work and processes in manufacturing and beyond. In both cases, companies should find ways for humans and machines to work together more effectively than either could alone. Labor shortages addressed by robotics should then free up human time for more of the uniquely creative and complex tasks where we thrive. As the author Joanna Maciejewska has astutely said, “I want AI to do my laundry and dishes so that I can do art and writing, not for AI to do my art and writing so that I can do my laundry and dishes.”58