Newsletter / Issue No. 30

Image by Ian Lyman/Midjourney.


31 Mar, 2025


Dear Readers,

As artificial intelligence continues to reshape industries and everyday life, one of the biggest challenges it faces is its insatiable need for energy. A major driver of this need is the chips used to power new AI systems. Known as graphics processing units, or GPUs, they are notoriously power-hungry, and improving upon or replacing them has become a major focus of new chip design.

This week we look at how a new wave of companies are tackling this effort. Their solutions range from optimizing current GPU designs to developing entirely new kinds of chips to rethinking computation from the ground up — all with the goal of making AI computation less energy intensive. 

Also in this issue: 

  • A new and portable MRI machine could help save stroke victims
  • Wireless, high-speed data connections are being delivered through light beams
  • Software could speed up EV battery charging 
  • The rise of vibe coding, a way of writing software that requires no coding skills, or even coding knowledge!

    Thanks for reading,

    Danielle Mattoon
    Executive Director, Aventine


    The Big Idea

    New Chips Could Reduce AI’s Need for Energy

    The computer chips that have enabled artificial intelligence to explode in competence come with a major downside: They are incredibly power-hungry. Responding to the world's growing demand for AI, many companies are working to address this problem with a new generation of chips — ones that will drive down energy use while maintaining or improving performance. 

    AI’s demand for power is driven by multiple factors. When it comes to training AI models, their size is important, as is the amount of data used to train them. Turning up both dials pushes the models to perform better, but also increases energy use. Then there’s inference — the technical term for making use of an artificial intelligence model, such as prompting a large language model (LLM) — which has exploded as a result of the chat interfaces offered by companies such as OpenAI and Anthropic. Morgan Stanley analysts have predicted that inference could account for more than three-quarters of all power use in U.S. data centers in the near future. All told, as Aventine wrote last year, some estimates suggest the AI industry may demand a total of 134 terawatt-hours per year by 2027 — about the same amount that the Netherlands uses annually.

    While efforts are underway to develop smaller, more data-efficient AI models that require less power both for training and inference, so far those models don’t perform as well as their larger siblings. In the meantime, both public and private companies are hoping energy savings could come from better chip design. This is a particularly promising target because the chips currently best suited to today’s AI workloads — the so-called graphics processing units (GPUs) that have made Nvidia, which supplies up to 95 percent of all AI chips, one of the world’s most valuable companies — are notoriously power-hungry. “Each [GPU] card has the equivalent energy consumption of an oven in your kitchen,” said Michael Förtsch, CEO of a German photonic chip company called Q.ANT. For context, Meta’s flagship frontier model, Llama 3, was trained on 24,576 GPUs that ran for weeks or months. Put another way: Training Llama 3 was, in energy terms, like leaving more than 25,000 ovens switched on constantly for months.
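    The oven comparison can be sanity-checked with a rough back-of-envelope calculation. The per-GPU power draw and the length of the training run below are assumptions for illustration, not figures from this newsletter:

```python
# Rough sanity check of the "25,000 ovens" comparison.
# Assumed figures (not from the newsletter): a top-end GPU card
# draws roughly 0.7 kW under sustained load, and a large training
# run lasts on the order of 90 days.
num_gpus = 24_576      # GPU count reported for Llama 3 training
gpu_power_kw = 0.7     # assumed sustained draw per GPU card
days = 90              # assumed length of the training run

energy_gwh = num_gpus * gpu_power_kw * 24 * days / 1e6
print(f"~{energy_gwh:.0f} GWh")  # → "~37 GWh"
```

    Tens of gigawatt-hours for a single training run is roughly the annual electricity use of a small town, which is why even modest per-chip efficiency gains matter at this scale.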

    So what might more efficient chips look like? The new approaches fall roughly into three buckets: Tinkering with the current GPU design; building new chips from scratch that work in fundamentally new ways; and rethinking our current approach to computing from the bottom up. Some approaches could deliver improvements in efficiency in the next couple of years; others are more akin to moonshots, but could bring the energy demands of computing down closer to zero if they succeed.

    Across the board, investors are keen to see this kind of technology made a reality. Money has surged into AI chip companies in recent years: $5 billion of venture capital was invested in such companies in 2024, almost three times as much as in 2020, according to the data provider Dealroom.

    Aventine spoke with experts from across the computer chip industry to better understand what’s happening with this new generation of chips. 

    Improving the chips we’re already using

    The majority of AI workloads are organized around many highly repetitive mathematical calculations being performed simultaneously, something “GPUs are really good at,” said Patrick Coles, chief scientist at Normal Computing, a startup that is attempting to employ the laws of thermodynamics to create energy-efficient chips. Top-end GPUs typically perform thousands of operations simultaneously, and trillions of them in total every second. (GPUs, by the way, get their name from their original purpose, which was performing the calculations required to generate realistic computer graphics.)

    Nvidia and incumbents like AMD and Intel, along with large-scale cloud computing platforms such as Amazon, Google, Microsoft and Meta, are all designing advanced AI chips aimed at performing calculations both faster and more efficiently than current GPUs. Google recently announced the sixth generation of its AI-focused chip, Trillium, which is about 1.7 times more energy efficient than its predecessor at tasks such as training the company’s Gemini 2.0 LLM; Amazon announced the second iteration of its AI chip, Trainium, which is supposed to be three times more efficient than the first version.

    As a true sign of the times, mega companies are deploying AI chips to design better AI chips. Google DeepMind’s AlphaChip, for instance, has been used to inform the design of the company’s AI chips since 2020, and it seems plausible — if far from certain — that such approaches could eke out further progress for classical chips, such as reducing energy loss by suggesting chip layouts that minimize the distance that signals must travel. 

    In addition to efforts by Big Tech, a band of startups, including Cerebras, Groq and SambaNova (each of which has raised in the neighborhood of a billion dollars, according to Dealroom), are also trying to improve on classical chip architectures, in particular by making inference more efficient. SambaNova claims that its chip can run LLMs 10 times faster than regular GPUs while also being 10 times more efficient; Cerebras promises inference that is 70 times faster than current GPUs while using a third of the power. These companies have all built and are selling chips that can replace those made by Nvidia, particularly for companies that host AI models. Cerebras, for instance, has supplied chips to the French AI startup Mistral, an OpenAI competitor.

    Making an entirely new kind of chip

    Those numbers are striking. But other companies believe greater efficiencies could be found in a wholesale change to how chips are designed.

    One approach takes aim at AI’s insatiable need for data. Training an AI model requires data to be moved from memory onto a chip for computation and then back to memory — a wasteful process in energy terms. “Each time that you need to fetch the data from the memory, you expend energy,” said Furnemont. “AI has increased that demand enormously.” In traditional computing, memory and processing are distinct pieces of hardware; a new generation of chips aims to combine the two functions in one place on a single chip. This idea is known as in-memory or near-memory compute, and numerous startups — such as d-Matrix, Rain AI, Vertical Compute and EnCharge AI — are working to make it a reality, building chips in which memory sits as close as possible to processing capacity. EnCharge, for instance, claims that its chips are 20 times more efficient than regular GPUs.

    Another, more radical, approach is to tackle the inherent inefficiency of using electricity for computation.  Whenever a current flows through any part of a traditional semiconductor (CMOS) chip, resistance causes power to dissipate as heat. On top of that, computer chips are made up of billions of transistors — effectively microscopic switches — that manipulate data in the form of 0s and 1s that are represented by voltages. Every time a transistor switches, it also dissipates power as heat, and this is intensified as the transistors are made to operate faster.

    A potential solution: Swap electricity for light wherever possible. This has a few big advantages. Light can be used to perform calculations important to some types of AI more efficiently than the electrons and transistors in traditional chips can; light also has far greater bandwidth than electricity, which means a chip built around it can perform operations on massive quantities of data simultaneously while creating far less heat in the process. Until now, workloads have been too varied to make light-based chips worthwhile, because such chips are useful for only certain types of operations. But the rise of AI has brought specialized operations to the fore, prompting companies such as Lightmatter in the U.S., Q.ANT in Germany and Lumai in the U.K. to develop hardware that takes advantage of light’s properties. Q.ANT claims that on text-recognition inference tasks, for instance, its hardware is 30 times more power efficient than comparable classical chips, and customers can already buy hardware that contains its technology. Light-based — or photonic — systems can also theoretically reduce electrical losses by replacing the electrical interconnections between processors and memory, said Furnemont, though this technology is not yet broadly used commercially.

    Finally, there are efforts to move AI workloads away from centralized data centers and onto individual devices, which would require a new breed of chip, different from GPUs. So-called neuromorphic chips use architectures that mirror the structure of the brain. They’re made up of interconnected nodes that use power only when they’re activated by spikes of activity passing through the network. Because they are not always active, they require far less power than running models on a standard GPU in the cloud. Sumeet Kumar, CEO of the neuromorphic computing startup Innatera Nanosystems, a spinout of the Delft University of Technology, explained that such a chip could be placed in, say, a smart doorbell and used to detect the presence of a human on the device, rather than constantly sending data to the cloud to do the same job.
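    The power advantage described above comes from doing work only when events arrive. A toy sketch of that event-driven idea, using a simplified spiking neuron — the parameters and structure here are illustrative, not a description of any vendor’s hardware:

```python
# Toy event-driven (spiking) neuron: computation happens only when
# an input spike arrives, which is the property that lets
# neuromorphic chips idle at near-zero power between events.
# All parameters are illustrative, not drawn from real hardware.

def run_neuron(spike_times, leak=0.9, weight=0.6, threshold=1.0):
    potential = 0.0
    output_spikes = []
    work_units = 0   # proxy for energy: updates actually performed
    last_update = 0
    for t in sorted(set(spike_times)):
        # Leak is applied lazily, only when an incoming spike
        # forces an update — no per-tick work while idle.
        potential *= leak ** (t - last_update)
        potential += weight
        work_units += 1
        if potential >= threshold:
            output_spikes.append(t)  # neuron fires, then resets
            potential = 0.0
        last_update = t
    return output_spikes, work_units

spikes, work = run_neuron([1, 2, 3, 8])
print(spikes, work)  # → [2] 4: four updates total, not one per clock tick
```

    A conventional clocked chip would burn power on every cycle of that timeline; here, energy is spent only at the four moments when input actually arrives.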

    The biggest problem facing these new chip architectures is “predominantly manufacturability,” said Lawrence Lundy-Bryan, a partner at Lunar Ventures specializing in investment in compute startups. While the theoretical underpinnings of all these approaches are well understood, and it’s possible to build chips in small batches that work as proofs of concept, building large volumes of affordable chips requires processes that are highly repeatable, with high yields and few errors. The experience of a Texas-based startup called Mythic serves as an object lesson. The company, which raised $165 million to build in-memory chips, according to the British technology news site The Register, promised that its hardware would use roughly a quarter of the power of conventional AI chips for inference. But it ran out of money in 2022. It has since secured additional funding, but it stands as a cautionary tale of how difficult this work can be. The difference between a proof of concept and a commercial product is “such a big ramp up,” said Furnemont, one that demands huge spending.

    Rethinking computing from the ground up

    In the quest for more energy-friendly systems, some people are thinking even bigger. What if computing itself could be reimagined and built around efficiency? 

    That, pretty much, is what Vaire Computing, a startup based in Cambridge, U.K., wants to do. It’s betting that, at a very basic level, it can reinvent the logical operations performed by a computer to be less wasteful. Most computational operations involve taking two inputs and using them to create an output, irreversibly discarding the information held in the inputs. The laws of thermodynamics dictate that such an irreversible process results in an increase in entropy, which produces heat. “It literally throws away 100 percent of the signal energy on every single clock cycle,” said Mike Frank, senior scientist at Vaire, which is pursuing what is known as reversible computing. “It's like the worst possible way that you could design the computer.”

    Vaire wants to rethink that from scratch. “The energy used to represent a bit in a logic circuit, you can recover those energies and reuse them, as opposed to just dissipating them to heat,” said Frank. The startup is using ideas that have existed in research since the 1960s to develop ways to manipulate the signals inside chips in order to preserve information, theoretically driving down energy use of chips by more than 99 percent. Vaire is manufacturing its first prototype chips this year, and hopes to build a chip that it can test with customers by 2027, which will be “useful for people doing high performance AI compute,” said Frank. Ultimately, though, the company doesn’t see its work as a potential solution to the AI energy crisis, but rather as a solution to energy waste in computers everywhere. Reversible computing is an “underlying technology that can help power any digital system,” said Frank.

    Another startup, Normal Computing, headquartered in New York, also believes it can leverage quirks of thermodynamics to make far more efficient chips. It is developing a system that uses variations in voltage on a chip, caused by temperature fluctuations, to drive calculations while consuming virtually no power. The company has demonstrated that the technique works, using it to perform complex matrix calculations that could be useful in some AI contexts. “We don't necessarily want to try to compete with GPUs on matrix [multiplications], because GPUs are really good at that,” said Coles. “But they're less good at more complicated linear algebra tasks.” Like Vaire, Normal plans to have a prototype chip built this year and a working version that it can share with potential clients as soon as 2027. Coles left a position as a scientist at Los Alamos National Laboratory, where he worked on quantum computing, to join Normal, frustrated by how long it was taking to build quantum computers. “The AI revolution is happening now, this decade,” he said, adding that speed of deployment is an important part of Normal’s approach.

    These approaches fall into the “exotic” and “maybe-one-day” bucket, according to Lundy-Bryan. But they also hint at a future where chips are no longer pumping out the same amount of heat as a warehouse full of ovens.

    Chips in the near and long term 

    So what to make of this sprawl of technologies? How might they all fit together?

    First, we shouldn’t bet against existing technologies persisting for a while. “I continue to think there will be developments that will mean digital CMOS continues to have a lifespan at the cutting edge for the next 5 to 10 years,” said Lundy-Bryan. As for where new technologies fit in, several of the experts Aventine spoke with said that realistically we can expect the range of computing approaches used for AI workloads to diversify in the future, with certain specialized chips being used to perform specific kinds of tasks. “There are certain types of algorithms or functions where I still believe that CMOS has its advantage,” said Förtsch. “So there is no need to replace it totally. But on the other hand, I'm seeing that a lot of jobs are currently positioned on the CMOS architecture which should be positioned somewhere else.” 

    Coles suggests that in the future, compilers — the computer programs that convert code into instructions that computer hardware can follow — might be able to determine which tasks should be handed to GPUs, which go to photonic chips, and so on. But, he adds: “Honestly, no one has really come up with a coherent strategy for how to handle the fact that we have all these different next-generation computing paradigms.”
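    A minimal sketch of what such routing could look like in software. The backend names and dispatch rules below are hypothetical, invented purely to illustrate the idea — no shipping compiler works this way today:

```python
# Hypothetical sketch of a compiler pass that routes operations to
# different hardware back ends based on workload type. The names
# and rules are invented for illustration only.

ROUTES = {
    "matmul": "gpu",             # dense matrix math: GPUs excel here
    "convolution": "photonic",   # repetitive, high-bandwidth: suits light-based chips
    "spiking_inference": "neuromorphic",  # sparse, event-driven workloads
}

def route(op: str) -> str:
    # Fall back to conventional CMOS for anything unrecognized,
    # e.g. branchy control flow that specialized chips handle poorly.
    return ROUTES.get(op, "cmos")

program = ["matmul", "convolution", "control_flow"]
plan = [(op, route(op)) for op in program]
print(plan)  # → [('matmul', 'gpu'), ('convolution', 'photonic'), ('control_flow', 'cmos')]
```

    The hard part, as Coles notes, is not the table itself but agreeing on a coherent strategy for when each paradigm actually wins.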

    Ultimately, existing computer architectures are highly energy inefficient for AI applications, and an enormous amount of effort is being expended to change that. Of the new technologies taking shape, some will fail and some will prevail, but at this stage it is too early to pick the winners and losers. Even those that succeed will have to compete with one another, as well as with the incumbent CMOS chips they’re taking on.

    “I think there will be a place for some of these new paradigms,” said Lundy-Bryan. “But they will not be replacements.”

    Listen To Our Podcast

    Learn about the past, present and future of artificial intelligence on our latest podcast, Humans vs Machines with Gary Marcus.

    Quantum Leaps

    Advances That Matter

    Taara terminals in Nairobi, Kenya, deployed in 2019. Photo courtesy of Taara

    Google’s vision for superfast wireless internet delivered by laser. Sending 20 gigabits of data per second across 12 miles using invisible light beams might seem implausible, but it’s what Google’s X lab, often referred to as its moonshot factory, is trying to do. The company’s Taara Lightbridges use light to send data over long distances at high speeds in places where other options range from the highly impractical to the impossible. One place where this is already being done (albeit at lower speeds), Wired reports, is Kinshasa, in the Democratic Republic of the Congo. Through lightbridge technology, the city is receiving access to a high-speed data connection from Brazzaville, five kilometers away across a river in the neighboring Republic of the Congo. Lightbridge devices have also been used for Caribbean islands cut off from the internet by failed undersea cables, for Indian towns awaiting the deployment of 5G cellular networks and for the Coachella music festival, among other locations. Essentially, the technology could be useful anywhere a fiber connection is needed but cable can’t be laid. In tests, the company has shown that it can send data at 10 gigabits per second over a distance of 1 kilometer, and that the link can tolerate brief disturbances to its line-of-sight connection, such as birds flying through the light beams, though fog is still a problem. A new chip, specifically designed for this technology, should allow the company to make the system more robust. Like all of Google X’s moonshots, there is a chance Taara could be quietly shut down. It’s also expensive; while pricing isn’t publicly available, sources told Wired that an installation can currently cost tens of thousands of dollars. That said, it could be a useful way to deliver fast data connections where there isn’t a conventional alternative.

    A portable MRI machine could help save stroke victims. A scaled-down version of the massive and expensive MRI scanners found in hospitals could soon allow emergency rooms to diagnose life-altering brain conditions and make treatment choices more quickly. IEEE Spectrum reports that a company called Wellumio in Wellington, New Zealand, has developed a small mobile MRI scanner that is just large enough to fit over a patient's head. The device uses the same underlying principle as its larger cousins, using magnetic fields and radio waves to interact with hydrogen atoms in the body and form images. But to create a portable device, the magnetic fields had to be less than a tenth the strength of those used in most full-body scanners. While that means the device provides lower-resolution images, it can still be used to measure blood flow, allowing clinicians to identify whether a patient is suffering from an ischemic stroke, which is caused by obstruction of blood vessels in the brain. Since most strokes are either ischemic or hemorrhagic, being able to determine which is which is critical because the treatments are different. The company hopes to increase the resolution of the device over time, potentially making it useful in assessing other sorts of head injuries. The device is currently in preclinical trials at the Royal Melbourne Hospital in Australia, and the company hopes that ultimately it will be used in everyday medical settings, particularly in locations without access to a full-size MRI scanner. 

    Better software could dramatically speed up battery charging. A London-based startup called Breathe Battery Technologies has developed a new approach to managing battery charging that could increase the charging speeds of electric vehicles by more than 30 percent. Currently, when an electric car is charged, software uses information about the temperature and charge level of the battery to determine how much electric current should be applied. The decision is made by consulting a table of values provided by the battery maker, but the values retrieved tend to be imprecise, especially outside the most typical operating conditions. The result is an inefficient charging cycle. Instead, reports The Verge, Breathe takes an automaker’s batteries, tests them in extreme conditions, and uses the results to build algorithms that describe in detail how best to charge a battery at any given moment. The company is working with Volvo to deploy the technology in a forthcoming vehicle, the ES90 sedan, due to go on sale later this year. The two companies claim the technology can improve charge time by more than 30 percent overall, and by as much as 48 percent in conditions in which batteries often struggle to operate, for instance at zero degrees Celsius. Breathe is also working on charging software for consumer devices, though it is unclear whether the gains would be as significant there, since such devices typically operate within more conventional temperature ranges.
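    The contrast between a coarse lookup table and a continuous model can be sketched in a few lines. Every number and function below is invented for illustration and bears no relation to Breathe’s actual algorithms or any real battery’s safe limits:

```python
# Illustrative contrast between table-based and model-based charge
# current selection. All values are invented for illustration and
# are NOT real battery parameters.

# Conventional approach: a coarse table keyed by temperature band
# and state-of-charge band, erring on the side of caution.
CHARGE_TABLE = {
    ("cold", "low"): 20.0,    # amps: very conservative when cold
    ("cold", "high"): 10.0,
    ("warm", "low"): 150.0,
    ("warm", "high"): 50.0,
}

def table_current(temp_c, soc):
    band = ("cold" if temp_c < 10 else "warm",
            "low" if soc < 0.5 else "high")
    return CHARGE_TABLE[band]

def model_current(temp_c, soc):
    # A continuous model fitted from test data can adapt smoothly
    # instead of snapping to coarse, worst-case bands.
    base = 160.0 * (1.0 - soc)                       # taper with charge level
    thermal = max(0.2, min(1.0, (temp_c + 10) / 35)) # scale with temperature
    return base * thermal

# At 0 °C and 30% charge, the coarse table stays very conservative,
# while the fitted model pushes meaningfully more current.
print(table_current(0, 0.3), round(model_current(0, 0.3), 1))  # → 20.0 32.0
```

    The gain comes from replacing worst-case bands with behavior measured across conditions, which is why the improvement is largest exactly where tables are most conservative, such as at low temperatures.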

    Long Reads

    Magazine and Journal Articles Worthy of Your Time

    Elon Musk put a chip in this paralysed man’s brain. Now he can move things with his mind. Should we be amazed or terrified? from The Guardian
    5,500 words, or about 22 minutes

    This story introduces us to an Arizona man called Noland Arbaugh who, in January of last year, became the first person to have a Neuralink chip implanted in his brain. Spoiler alert: Arbaugh’s implant isn’t working as it should. The device has come partially loose in his brain, which means that only 15 percent of the connections are actually effective. But that only makes what Arbaugh can do thanks to the implant more impressive. He can, for instance, simply think about where a computer cursor needs to move, and the cursor moves there. This allows him to do all sorts of other things, like type at 25 words a minute on an on-screen keyboard, browse the web, use social media, play games, and use a computer in countless other ways that most of us take for granted. He loves gaming, for example, and has learned to play a version of the video game Civilization VI faster than one of his friends. Nevertheless, there are notes of caution throughout this story about the implications of a privately held company controlling the technology inside the brains of an untold number of customers. Arbaugh is also living with the fact that, as the first recipient of a Neuralink device, he is the proverbial guinea pig, and likely experiencing more problems than subsequent users will. But despite all of that, Arbaugh seems thrilled by what the device can do and his opportunity to test it out.

    Will the future of software development run on vibes? from Ars Technica
    1,800 words, or about 7 minutes

    Earlier this year, Andrej Karpathy — one of the co-founders of OpenAI, who has since left the company — coined a new term: vibe coding, an approach to writing code that has recently become a trend in Silicon Valley. Rather than studiously writing code with skill and precision, vibe coding, according to Karpathy, is all about asking an AI large language model to write code for you and then just going with the flow. Got an idea? Ask the AI to turn it into code. Does the code have a bug? Tell the LLM to fix it. Do you need to be able to actually read code to confirm that it works? Nope. “I just see stuff, say stuff, run stuff, and copy paste stuff, and it mostly works,” wrote Karpathy. It should be noted that vibe coding is distinct from the now-common practice among developers of using LLMs to help write code; vibe coding puts blind faith in the code that LLMs create, without any oversight. Garry Tan, CEO of the startup accelerator Y Combinator, has said that startups built on vibe coding require a staff of 10 where they would previously have employed 50 to 100 people. But there are definite risks to this trend: If anyone can produce code by asking an LLM to do it for them, there is a strong likelihood that much of that code will be badly written or even dangerous. This piece from Ars Technica looks at how widespread the phenomenon has already become, the problems it creates, and how the approach could change what it means to be a programmer.

    The age of CRISPR, from The Economist’s Technology Quarterly
    9,200 words over eight articles, or about 37 minutes

    It’s just over a decade since the gene-editing tool known as CRISPR was invented, but it has certainly made its mark. The technology now shows potential for transforming our food supplies, eradicating diseases, eliminating the impact of invasive species, enabling transplantation of animal organs into humans and far more. Yet it also remains controversial, particularly when it comes to genetically engineering crops or — far more controversially — humans. This special report from The Economist takes a close look at the promises made by CRISPR and how likely it is to deliver on them, and prods at the ethical issues that continue to plague it. It’s a worthwhile opportunity to get up to speed on a technology that’s only going to become more pervasive over time.


    aventine

    About Us
    Podcast

    contact

    380 Lafayette St.
    New York, NY 10003
    info@aventine.org


    sign up for updates

    If you would like to subscribe to our newsletter and be kept up to date on upcoming Aventine projects, please enter your email below.

    © Aventine 2021
    Privacy Policy.