Enterprises have many different AI models to choose from and sometimes will need to use multiple models together. But how can an enterprise automatically select the best model, based on the task and the cost?
That’s the challenge that AI startup Martian aims to solve with its LLM router technology. Martian competes against a number of other model router startups, including Not Diamond, which launched on July 30.
Among the many organizations looking to optimize enterprise AI model usage is Accenture, which today announced that it is investing in Martian, though it is not revealing the specific amount. Accenture has a growing platform of AI services and partnerships as it seeks to capture enterprise interest and demand, and it is set to integrate Martian into its switchboard services, which help enterprises select models. Martian emerged from stealth in November 2023 and has been steadily building out its technology over the past year. Alongside the Accenture deployment, the company is also rolling out a new AI model compliance feature as part of its router platform.
The Accenture switchboard to date has helped organizations to select models for enterprise deployment. What Martian adds into the mix is the ability to do dynamic routing to the best model.
“We can automatically choose the right model, not even on a task by task basis, but a query by query basis,” Shriyash Upadhyay, co-founder of Martian, told VentureBeat. “This allows for lower costs and higher performance, because it means that you don’t always have to use a single model.”
In a statement, Lan Guan, chief AI officer at Accenture, commented that many of Accenture’s clients are looking to reap the benefits of generative AI in a way that considers requirements, performance and cost.
“The capabilities of Accenture’s switchboard services and Martian’s dynamic LLM routing simplify the user experience and will allow enterprises to experiment with generative AI and LLMs in order to find the perfect fit for their business needs,” Guan stated.
How Martian routes enterprise AI queries to the best model
Martian builds model routers that can dynamically select the best model to use for a given query.
The core technology behind the router focuses on predicting model behavior.
“We take a relatively unique approach in doing this, where we focus on trying to understand the internals of what’s going on inside of these models,” Upadhyay said. “A model contains enough information to predict its own behavior, because it does that behavior.”
The approach allows Martian to select the single best model to run, optimizing for factors like cost, quality of output and latency. Martian uses techniques like model compression, quantization, distillation and specialized models to make these predictions without needing to run the full models. The Martian routing system can be integrated into applications that use language models, allowing it to dynamically choose the optimal model to use for each query, rather than relying on a single pre-selected model. This helps improve performance and reduce costs compared to static model selection.
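Martian has not published the details of its router API, so the sketch below is only a rough illustration of what per-query routing looks like from an application's point of view; the candidate models, cost figures, `predict_quality` heuristic and scoring weights are all hypothetical, not Martian's actual system.

```python
# Hypothetical sketch of per-query model routing. The candidate models,
# cost figures and scoring function are illustrative, not Martian's API.

CANDIDATES = {
    "small-fast-model": {"cost_per_1k_tokens": 0.0005, "quality": 0.72, "latency_s": 0.4},
    "mid-tier-model":   {"cost_per_1k_tokens": 0.0030, "quality": 0.85, "latency_s": 1.1},
    "frontier-model":   {"cost_per_1k_tokens": 0.0150, "quality": 0.95, "latency_s": 2.5},
}

def predict_quality(model_name: str, query: str) -> float:
    """Stand-in for a learned predictor of how well a given model will
    handle this specific query -- the hard part a router has to solve."""
    base = CANDIDATES[model_name]["quality"]
    # Naive placeholder heuristic: treat longer queries as harder for every model.
    difficulty = min(len(query) / 2000, 0.2)
    return base - difficulty

def route_query(query: str, cost_weight: float = 0.5, latency_weight: float = 0.1) -> str:
    """Pick the model with the best predicted quality-per-cost trade-off."""
    def score(name: str) -> float:
        m = CANDIDATES[name]
        return (predict_quality(name, query)
                - cost_weight * m["cost_per_1k_tokens"] * 100
                - latency_weight * m["latency_s"])
    return max(CANDIDATES, key=score)

print(route_query("Summarize this contract clause in one sentence."))
```

In a real deployment the quality predictor is the difficult piece; Martian's pitch is that it can estimate a model's behavior without having to run the full model on every candidate.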
Why model routing should be an enterprise AI imperative
Using the best tool for the job is a common business idiom, but far less common is the awareness inside organizations that there is a wide range of highly specific AI models to choose from.
“Often these large companies might have different organizations where some part of the org doesn’t even know about the fact that there is this whole world of different models out there,” Upadhyay said.
To use AI models effectively, Upadhyay emphasized, defining success metrics is critical. Organizations need to determine which metrics actually define success and what they actually care about in a specific application.
Cost optimization and return on investment are also critical. Upadhyay noted that organizations need to optimize costs and demonstrate some form of return on investment for model deployments. In his view, model routing is essential because it serves both purposes.
Compliance is always a concern in the enterprise, and that’s an area Martian is now taking on with its model router. The new compliance feature in Martian helps companies vet and approve AI models for use in their applications. Upadhyay said the feature will allow companies to define a set of compliance policies that are applied automatically.
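Martian has not described the mechanics of the compliance feature, but conceptually it amounts to filtering the candidate pool before routing. The sketch below is a guess at what such a policy check could look like; the policy fields and model metadata are invented for the example.

```python
# Hypothetical illustration of policy-based model vetting before routing.
# The policy fields and model metadata are invented, not Martian's schema.

MODEL_METADATA = {
    "small-fast-model": {"data_residency": "us", "open_weights": True,  "approved_for_pii": False},
    "frontier-model":   {"data_residency": "us", "open_weights": False, "approved_for_pii": True},
}

POLICY = {
    "require_data_residency": "us",
    "require_pii_approval": True,   # e.g. for a customer-support application
}

def allowed_models(policy: dict) -> list[str]:
    """Return only the models an enterprise has vetted under its policy."""
    out = []
    for name, meta in MODEL_METADATA.items():
        if meta["data_residency"] != policy["require_data_residency"]:
            continue
        if policy["require_pii_approval"] and not meta["approved_for_pii"]:
            continue
        out.append(name)
    return out

print(allowed_models(POLICY))  # the router then picks only from this approved list
```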
Enterprise AI model router could be a boon for agentic AI
One of the driving use cases for AI model routing in the enterprise is the growing area of agentic AI.
With agentic AI, an AI agent will chain together multiple models and actions in order to achieve a result. Each step in an agent workflow depends on the previous steps, so errors can compound exponentially. Martian’s routing helps ensure the best model is used for each step to maintain high accuracy.
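The arithmetic behind that compounding is straightforward: if each step in an agent workflow succeeds independently with probability p, the whole chain succeeds with probability p raised to the number of steps. A quick back-of-the-envelope check, with purely illustrative figures:

```python
# Illustrative arithmetic only: if each agent step succeeds with probability p,
# an n-step workflow succeeds end to end with probability p ** n.
for per_step_accuracy in (0.99, 0.95, 0.90):
    for steps in (5, 10, 20):
        print(f"{per_step_accuracy:.0%} per step, {steps} steps -> "
              f"{per_step_accuracy ** steps:.0%} end-to-end")
```

At 95% per-step accuracy, a ten-step workflow succeeds end to end only about 60% of the time, which is why even a small per-step gain from picking the right model at each hop makes a large difference in overall reliability.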
“Agents are like the killer use case for routing,” Upadhyay said. “It’s a case in which you really, really care about getting steps right, otherwise you have this cascade of failures afterwards.”