Nvidia rivals focus on building a different kind of chip to power AI products
November 20, 2024
By MATT O'BRIEN and BARBARA ORTUTAY
SANTA CLARA, Calif. (AP) — Building the current crop of artificial
intelligence chatbots has relied on specialized computer chips pioneered
by Nvidia, which dominates the market and made itself the poster child
of the AI boom.
But the same qualities that make those graphics processing units, or GPUs,
so effective at creating powerful AI systems from scratch make them less
efficient at putting AI products to work.
That's opened up the AI chip industry to rivals who think they can
compete with Nvidia in selling so-called AI inference chips that are
more attuned to the day-to-day running of AI tools and designed to
reduce some of the huge computing costs of generative AI.
“These companies are seeing opportunity for that kind of specialized
hardware,” said Jacob Feldgoise, an analyst at Georgetown University's
Center for Security and Emerging Technology. “The broader the adoption
of these models, the more compute will be needed for inference and the
more demand there will be for inference chips.”
What is AI inference?
It takes a lot of computing power to make an AI chatbot. It starts with
a process called training or pretraining — the “P” in ChatGPT — that
involves AI systems “learning” from the patterns of huge troves of data.
GPUs are good at doing that work because they can run many calculations
at a time on a network of devices in communication with each other.
However, once trained, a generative AI tool still needs chips to do the
work — such as when you ask a chatbot to compose a document or generate
an image. That's where inferencing comes in. A trained AI model must
take in new information and make inferences from what it already knows
to produce a response.
GPUs can do that work, too. But it can be a bit like taking a
sledgehammer to crack a nut.
“With training, you’re doing a lot heavier, a lot more work. With
inferencing, that’s a lighter weight,” said Forrester analyst Alvin
Nguyen.
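To make the contrast concrete, here is a minimal sketch — our illustration, not d-Matrix's or any chipmaker's actual software — using a toy PyTorch model. Training means many repeated forward and backward passes that update the model's weights; inference is a single, gradient-free forward pass per request.

```python
import torch
import torch.nn as nn

# A toy model; real chatbots have billions of parameters, not hundreds.
model = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 4))
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.CrossEntropyLoss()

# Training: repeated forward AND backward passes over batches of data,
# updating every weight each step -- the GPU-hungry phase.
for step in range(100):
    inputs = torch.randn(32, 16)            # stand-in for training data
    targets = torch.randint(0, 4, (32,))    # stand-in for labels
    loss = loss_fn(model(inputs), targets)
    optimizer.zero_grad()
    loss.backward()                         # gradients: the heavy part
    optimizer.step()

# Inference: one forward pass per user request, no gradients kept.
# This lighter workload is what inference chips are tuned for.
with torch.no_grad():
    answer = model(torch.randn(1, 16)).argmax(dim=1)
```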
That's led startups like Cerebras, Groq and d-Matrix as well as Nvidia's
traditional chipmaking rivals — such as AMD and Intel — to pitch more
inference-friendly chips as Nvidia focuses on meeting the huge demand
from bigger tech companies for its higher-end hardware.
Inside an AI inference chip lab
D-Matrix, which is launching its first product this week, was founded in
2019 — a bit late to the AI chip game, as CEO Sid Sheth explained during
a recent interview at the company’s headquarters in Santa Clara,
California, the same Silicon Valley city that's also home to AMD, Intel
and Nvidia.
“There were already 100-plus companies. So when we went out there, the
first reaction we got was ‘you’re too late,’” he said. The pandemic's
arrival six months later didn't help as the tech industry pivoted to a
focus on software to serve remote work.
Now, however, Sheth sees a big market in AI inferencing, comparing that
later stage of machine learning to how human beings apply the knowledge
they acquired in school.
“We spent the first 20 years of our lives going to school, educating
ourselves. That’s training, right?” he said. “And then the next 40 years
of your life, you kind of go out there and apply that knowledge — and
then you get rewarded for being efficient.”
Employees work behind an evaluation board, foreground, in a lab at the d-Matrix office in Santa Clara, Calif., Wednesday, Oct. 16, 2024. (AP Photo/Jeff Chiu)
The product, called Corsair, consists of two chips with four chiplets
each, made by Taiwan Semiconductor Manufacturing Company, which also
makes most of Nvidia's chips, and packaged together in a way that helps
keep them cool.
The chips are designed in Santa Clara, assembled in
Taiwan and then tested back in California. Testing is a long process
and can take six months — if anything is off, it can be sent back to
Taiwan.
During a recent visit, d-Matrix workers were doing final testing on the
chips in a laboratory with blue metal desks covered with cables,
motherboards and computers, and a cold server room next door.
Who wants AI inference chips?
While tech giants like Amazon, Google, Meta and Microsoft have been
gobbling up the supply of costly GPUs in a race to outdo each other
in AI development, makers of AI inference chips are aiming for a
broader clientele.
Forrester's Nguyen said that could include Fortune 500 companies
that want to make use of new generative AI technology without having
to build their own AI infrastructure. Sheth said he expects a strong
interest in AI video generation.
“The dream of AI for a lot of these enterprise companies is you can
use your own enterprise data,” Nguyen said. “Buying (AI inference
chips) should be cheaper than buying the ultimate GPUs from Nvidia
and others. But I think there’s going to be a learning curve in
terms of integrating it.”
Feldgoise said that, unlike chips built for training, those designed for
AI inference prioritize how quickly a user gets a chatbot's response.
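As a hypothetical illustration of that priority, the sketch below times how long a user waits for the first piece of a streamed answer (time to first token) versus overall tokens per second; `generate_tokens` here is a made-up placeholder, not any real chatbot API.

```python
import time

def generate_tokens(prompt):
    # Placeholder for a streaming model backend; the sleep simulates
    # per-token compute on whatever chip is serving the request.
    for word in ("Hello", "there", "friend"):
        time.sleep(0.05)
        yield word

start = time.perf_counter()
first_token_at = None
count = 0
for token in generate_tokens("Hi"):
    if first_token_at is None:
        # The delay a person actually notices in a chatbot.
        first_token_at = time.perf_counter() - start
    count += 1

total = time.perf_counter() - start
print(f"time to first token: {first_token_at:.3f}s, "
      f"throughput: {count / total:.1f} tokens/s")
```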
He said another whole set of companies is developing AI hardware for
inference that can run not just in big data centers but locally on
desktop computers, laptops and phones.
Why does this matter?
Better-designed chips could bring down the huge costs of running AI
to businesses. That could also affect the environmental and energy
costs for everyone else.
Sheth says the big concern right now is, “are we going to burn the
planet down in our quest for what people call AGI — human-like
intelligence?”
It’s still fuzzy when AI might get to the point of artificial
general intelligence — predictions range from a few years to
decades. But, Sheth notes, only a handful of tech giants are on that
quest.
“But then what about the rest?” he said. “They cannot be put on the
same path.”
Those other companies don't want to run very large AI models; it's too
costly and uses too much energy.
“I don’t know if people truly, really appreciate that inference is
actually really going to be a much bigger opportunity than training.
I don’t think they appreciate that. It’s still training that is
really grabbing all the headlines,” Sheth said.
All contents © copyright 2024 Associated Press. All rights reserved.