From the editor's desk: Groq – the future of AI processing?

28 March 2025 | AI & ML


Peter Howells, Editor

For the past few years, the world has been hit by a storm of AI-generated information, mostly produced using generative pre-trained transformer (GPT) models for inference. These models excel at handling large language model (LLM) requests, but they do have one drawback: the response time, or lag, is noticeable. This is largely down to the hardware these models run on, namely GPUs.

Many of the GPUs running AI models in professional data centres are from Nvidia’s A100 (or similar) series, each containing thousands of CUDA cores – many more than the handful of processing cores in a standard CPU. These CUDA cores are designed for parallel processing and work together to answer the language requests directed at them, having been optimised for tasks like scientific simulations. But are they really optimised for AI inference?

Groq seems to be the new kid in all this AI hoopla, but the company has been around since 2016, when it was founded by a group of former Google employees led by Jonathan Ross, one of the designers of the Tensor Processing Unit (TPU), and Douglas Wightman, an engineer at Google X. The TPU is an AI accelerator built as an application-specific integrated circuit (ASIC), a custom-designed chip tailored to a specific task. ASICs offer optimised performance and efficiency compared to general-purpose processors.

And this is where the story gets exciting. Groq runs AI models on its own ASIC, the Language Processing Unit (LPU), rather than on GPU architecture, delivering responses comparable to the current slew of GPT models in use. Groq’s architecture was developed to expedite machine learning workloads, providing unparalleled speed and efficiency. This is a big deal – Groq needs much less energy to answer the same requests and, more importantly, does so with seemingly no lag. This last property is down to the speed at which the ASICs perform these ‘application-specific’ tasks.

My own real-world testing bears this out. When I posed exactly the same technical question to both the ChatGPT 3.5 and 4.0 models and to Groq, and compared the response times, I can say without a doubt that Groq has minimal lag compared to the GPT models. The information in the responses is presented in different formats, but the answers compare favourably with each other. Groq’s response is almost immediate, whereas the other models take a few seconds before beginning to display an answer.
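For readers who want to repeat such a comparison themselves, the usual metric is ‘time to first token’ – how long a streaming response takes to start arriving. The sketch below shows a minimal timing harness; the `simulated_stream` stub and its delay figures are hypothetical stand-ins for illustration only, not measured values from either service.

```python
import time

def time_to_first_token(stream):
    """Seconds from request until the stream yields its first token."""
    start = time.perf_counter()
    for _token in stream:
        return time.perf_counter() - start
    return float("inf")  # the stream produced nothing

def simulated_stream(first_token_delay, tokens):
    """Hypothetical stand-in for a streaming chat API response."""
    time.sleep(first_token_delay)  # model 'thinking' before the first token
    yield from tokens

# Illustrative delays only – not measurements of either service.
gpt_latency = time_to_first_token(simulated_stream(0.5, ["Hello", "world"]))
groq_latency = time_to_first_token(simulated_stream(0.05, ["Hello", "world"]))
print(f"GPT-style: {gpt_latency:.2f} s, Groq-style: {groq_latency:.2f} s")
```

With a real service, `stream` would simply be the provider’s streaming response iterator, and the same harness then measures actual first-token latency.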

The introduction of Groq’s ASIC-based approach to AI inferencing marks a significant shift in the landscape of LLMs. By prioritising speed and efficiency, Groq is challenging the current dominance of GPU-driven AI, offering near-instantaneous responses while consuming less power. As AI applications continue to expand, this technology could redefine the way we interact with AI systems, setting a new benchmark for responsiveness.

Whether this signals a broader industry shift remains to be seen, but one thing is clear – Groq has introduced a compelling alternative that demands attention.



Further reading:

NXP has expanded its MCX A Series
Altron Arrow AI & ML
NXP has significantly expanded its MCX A Series of Arm Cortex-M33 microcontrollers, doubling the portfolio with six new families aimed at industrial and IoT edge applications.

From the editor's desk: Engineering the future
Technews Publishing Editor's Choice
As we welcome the first issue of Dataweek in a new year, it is an exciting time to be part of the electronics community, especially for our readers. The pace of change across our industry continues to accelerate, reshaping how we design, build, and interact with technology.

From the editor's desk: Could X-ray lithography disrupt the economics of advanced chip manufacturing?
Technews Publishing Editor's Choice
Advanced semiconductor manufacturing has reached a point where technical progress is increasingly constrained by economic reality, and the proposed use of X-ray lithography represents a bold attempt to reset these economics.

AI-ready embedded SBC
AI & ML
The new Grinn GenioBoard SBC provides a production-ready implementation of a powerful eight-core MediaTek processor, backed by high-speed interfaces, a Linux distro, and CRA-ready security software.

Alif Semiconductor elevates generative AI at the edge
AI & ML
Developers can now use the ExecuTorch Runtime for AI applications built to run on its Ensemble E4/E6/E8 series of MCUs and fusion processors.

From the editor's desk: Resilience and innovation in South Africa’s electronics sector
Technews Publishing Editor's Choice
For South Africa in particular, 2025 has been a year that highlighted the resilience and adaptability of our engineering community as we navigated shifting technologies and a fast-moving international landscape.

Is it time for Wi-Fi 7 in SA?
Technews Publishing Editor's Choice Telecoms, Datacoms, Wireless, IoT
Wi-Fi 7, the IEEE 802.11be standard also known as Extremely High Throughput, is the next-gen wireless networking standard designed to dramatically improve speed, latency, efficiency, and reliability.

Questing for the quantum AI advantage
Editor's Choice AI & ML
Two quantum experts disclose high hopes and realities for this emerging space.

How a vision AI platform and the STM32N6 can turn around an 80% failure rate for AI projects
Altron Arrow AI & ML
The vision AI platform, PerCV.ai, could be the secret weapon that enables a company to deploy an AI application when so many others fail.

Infineon’s OPTIGA for more secure AI and ML models
Future Electronics AI & ML
Infineon Technologies provides its OPTIGA Trust M security solution to Thistle Technologies for embedded computing products.

Read more...
© Technews Publishing (Pty) Ltd | All Rights Reserved