Local startup Trellis Data claims to have built the world’s fastest AI decoding technique, allowing text to be generated more than three times quicker on popular large language models.
New research from the Canberra-based company reveals its Dynamic Depth Decoding (D3) technique delivers an average speed boost of 44 per cent compared with the leading decoder on which it is based, EAGLE-2.
It translates to a 68 per cent reduction in computational power requirements at a time when AI is increasingly power-hungry and tech giants are turning to nuclear as a panacea for energy demands.
“D3 enables us to address one of the key bottlenecks of speed – the decoder – offering customers a reduction in the cost of running AI servers and a lower carbon footprint,” Trellis Data chief executive and co-founder Michael Gately said.
According to the research conducted by Trellis Data sister company ML Research Labs and the Australian National University, D3 optimises the existing EAGLE-2 algorithm by “making the beam search depth dynamic based on the draft confidence”.
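As a rough illustration of what "dynamic depth based on draft confidence" can mean, the Python sketch below keeps drafting tokens only while the draft's running confidence stays above a threshold. It is a simplified, single-sequence toy and not the D3 or EAGLE-2 implementation: the function names, the `min_confidence` threshold, the `max_depth` cap and the toy draft model are all hypothetical, and the real technique operates on a beam search rather than a single sequence.

```python
from typing import Callable, List, Tuple

# Hypothetical signature: given the context so far, a draft step returns
# (next_token, probability the draft model assigns to that token).
DraftStep = Callable[[List[int]], Tuple[int, float]]


def draft_with_dynamic_depth(
    draft_step: DraftStep,
    prefix: List[int],
    max_depth: int = 8,
    min_confidence: float = 0.5,
) -> List[int]:
    """Draft tokens until the running confidence falls below a threshold."""
    drafted: List[int] = []
    confidence = 1.0
    for _ in range(max_depth):
        token, prob = draft_step(prefix + drafted)
        confidence *= prob  # joint probability of the drafted sequence
        if confidence < min_confidence:
            break  # draft is no longer trustworthy: stop drafting early
        drafted.append(token)
    return drafted


def toy_draft_step(context: List[int]) -> Tuple[int, float]:
    """Toy stand-in for a draft model: confidence decays with depth."""
    probs = [0.9, 0.9, 0.8, 0.6, 0.5]
    i = len(context)
    return i, probs[min(i, len(probs) - 1)]


print(draft_with_dynamic_depth(toy_draft_step, []))  # prints [0, 1, 2]
```

In this toy, a fixed-depth drafter would always spend `max_depth` draft steps, while the dynamic version stops after three tokens because the fourth would push the joint confidence below 0.5.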
EAGLE-2, which is widely considered the current baseline for fast decoding, works to accelerate LLM reasoning for runtime improvements without any loss of accuracy.
The research, which is currently undergoing peer review, shows D3 scored an average speedup of 3.16x across several tests on a single Nvidia A40 GPU, compared with EAGLE-2 and its predecessor, EAGLE.
D3 was tested using the same Vicuna (Vicuna-7B, Vicuna-13B) and Llama (LLaMA2-Chat 7B and LLaMA2-Chat 13B) models used for EAGLE and EAGLE-2, according to the paper.
Mr Gately, a former public servant who co-founded Trellis Data with his wife Rachel Gately in 2016, told InnovationAus.com that the company “employs a different method over the top of how EAGLE-2 was applying and just produce a better result”.
“All good things in AI… are incremental these days and we’re finding smarter ways to produce the same results, or produce better results even.”
The company said the speed boost translates to a 68.4 per cent reduction in computational power requirements, a figure that is not contained in the study, and will deliver a competitive advantage for customers as the size of LLMs and their use continue to grow.
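The 68.4 per cent figure is consistent with the 3.16x average speedup if the same workload is assumed to need proportionally less compute time. A minimal Python sketch of that arithmetic follows; the function name `compute_reduction` is illustrative, and the assumption that compute scales inversely with speedup is not stated in the study.

```python
def compute_reduction(speedup: float) -> float:
    """Fraction of compute time saved if a job runs `speedup` times faster.

    Assumes compute cost is proportional to run time, so the new cost is
    1/speedup of the original and the saving is the remainder.
    """
    return 1.0 - 1.0 / speedup


reduction = compute_reduction(3.16)
print(f"{reduction:.1%}")  # prints 68.4%
```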
Mr Gately said the “carbon footprint or energy rating of AI models” was important but had been “largely overlooked” by industry to date.
“It’s important that we start analysing if we can make AI models 100 times smaller, if we can make them 100 faster, and then we could use less computer and save rare earth minerals,” he said.
The company’s customers, including those in government and law enforcement, will be some of the first to gain access to the speed boost D3 offers across speech management and knowledge management, Mr Gately said.
Since it was founded in 2016, Trellis Data has raised almost $6 million and is now reportedly positioning itself for a listing on the ASX. CSIRO’s Main Sequence is its main external backer.
Over the last two years, the company expanded its footprint to Adelaide and Virginia in the United States. It also has a goal of creating up to 500 local jobs over the next five years.
In May, Mr Gately told a Senate hearing that a ready-to-sign government contract was scrapped the day after the federal government’s Microsoft Copilot trial was announced.