Gimlet Labs tackles AI inference bottleneck with an elegant new approach

Startup Gimlet Labs is addressing the AI inference bottleneck with an efficient, scalable approach that improves performance and reduces compute costs.

Shivangi Yadav

Mar 23, 2026 - 23:57

Image Credits: Gimlet Labs

Zain Asgar, a Stanford adjunct professor and a founder with a previous successful exit, has raised $80 million in Series A funding for his new startup, Gimlet Labs, which aims to address one of the biggest challenges in artificial intelligence: inference bottlenecks. Menlo Ventures led the funding round.

Gimlet Labs has developed what it describes as the first “multi-silicon inference cloud,” a software platform designed to run AI workloads across multiple types of hardware simultaneously. Instead of relying on a single type of processor, the system distributes tasks across CPUs, AI-optimised GPUs, and high-memory systems.

According to Asgar, the platform is designed to operate across whatever hardware resources are available. This flexibility is important because different stages of AI processing demand different types of computing power. For example, the initial prompt-processing (prefill) stage of inference is typically compute-heavy, decoding relies more on memory, and tool-related operations are often network-dependent.

As Menlo Ventures partner Tim Tully explained, no single chip currently handles all these requirements efficiently. However, the industry already has a diverse mix of hardware deployed, including both newer chips and repurposed older GPUs. The missing piece, he suggests, is software capable of coordinating these resources effectively — a gap Gimlet Labs is attempting to fill.

The scale of the problem is significant. McKinsey estimates that global data centre spending could reach nearly $7 trillion by 2030 if current trends continue. Despite this massive investment, Asgar notes that existing hardware is often underutilised, with systems operating at only 15% to 30% of their capacity. He argues that this inefficiency results in hundreds of billions of dollars in wasted resources.
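To put the utilisation figures in perspective, the short sketch below converts a utilisation rate into the share of capacity left idle and the amount of hardware deployed per unit of work actually performed. The function names are illustrative, and the calculation assumes, as a simplification, that all unused capacity is recoverable, which real systems would not fully achieve.

```python
def idle_fraction(utilisation: float) -> float:
    """Share of installed capacity that sits unused at a given utilisation."""
    if not 0 < utilisation <= 1:
        raise ValueError("utilisation must be in the range (0, 1]")
    return 1.0 - utilisation


def hardware_per_unit_of_work(utilisation: float) -> float:
    """Hardware deployed relative to what a fully utilised system would need.

    Simplifying assumption: capacity scales linearly and all idle capacity
    could in principle be put to work.
    """
    if not 0 < utilisation <= 1:
        raise ValueError("utilisation must be in the range (0, 1]")
    return 1.0 / utilisation


# The 15% and 30% figures are the ones Asgar cites in the article.
for u in (0.15, 0.30):
    print(
        f"utilisation {u:.0%}: "
        f"{idle_fraction(u):.0%} idle, "
        f"{hardware_per_unit_of_work(u):.2f}x hardware per unit of work"
    )
```

Under these assumptions, a system running at 15% utilisation leaves 85% of its capacity idle, while one at 30% leaves 70% idle.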

To address this, Gimlet Labs has built orchestration software that breaks down AI workloads — particularly those involving agent-based systems — into smaller components. These components can then be executed simultaneously across different hardware types, maximising efficiency.
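The general idea of splitting a workload into stages and matching each stage to suitable hardware can be illustrated with a minimal sketch. This is not Gimlet Labs' software or API: the stage names, the hardware pool names, and the bottleneck-to-hardware mapping are all hypothetical, chosen only to mirror the compute, memory, and network distinction described in the article.

```python
from dataclasses import dataclass

# Hypothetical mapping from a stage's main bottleneck to a hardware pool.
# Illustrative only; the article does not describe Gimlet Labs' actual policy.
PREFERRED_HARDWARE = {
    "compute": "gpu",
    "memory": "high_memory_system",
    "network": "cpu",
}


@dataclass(frozen=True)
class Stage:
    name: str
    bottleneck: str  # "compute", "memory", or "network"


def route(stages, available):
    """Return a {stage name: hardware pool} plan.

    Each stage goes to its preferred pool when that pool is available;
    otherwise it falls back to the first available pool.
    """
    if not available:
        raise ValueError("at least one hardware pool is required")
    plan = {}
    for stage in stages:
        preferred = PREFERRED_HARDWARE.get(stage.bottleneck)
        plan[stage.name] = preferred if preferred in available else available[0]
    return plan


# A hypothetical agent request broken into three stages.
agent_request = [
    Stage("prompt_processing", "compute"),
    Stage("decoding", "memory"),
    Stage("tool_call", "network"),
]

plan = route(agent_request, ["gpu", "high_memory_system", "cpu"])
for stage_name, pool in plan.items():
    print(f"{stage_name} -> {pool}")
```

A production orchestrator would also have to account for data movement between devices, queueing, and cost, none of which this sketch attempts to model.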

The company claims its technology can improve inference performance by three to ten times without increasing costs or power consumption. It also says its system can partition AI models themselves, allowing different parts of a model to run on the most suitable hardware architecture.

Gimlet Labs has already formed partnerships with major chip manufacturers, including NVIDIA, AMD, Intel, ARM, Cerebras, and d-Matrix. Its product is available either as standalone software or via an API integrated with its Gimlet Cloud platform.

However, the offering is not aimed at smaller developers. Instead, it is designed for large AI labs and major data centre operators that manage significant computational workloads.

The startup officially launched in October and reported eight-figure revenue, meaning at least $10 million. Asgar said the company’s customer base has more than doubled in the past four months and now includes both a major AI model developer and a large cloud computing provider, though he declined to name them.

The founding team — Asgar, Michelle Nguyen, Omid Azizi, and Natalie Serrino — previously worked together at Pixie, a startup focused on Kubernetes observability tools. Pixie was acquired by New Relic in 2020, just two months after launching with a $9 million Series A led by Benchmark. Its technology is now part of the open-source Kubernetes ecosystem.

Asgar’s connection with Menlo Ventures’ Tim Tully began with a chance meeting about a year ago. Early angel backing from Stanford professors helped attract broader investor attention. Following the company’s launch, venture capital firms were eager to participate, and the funding round was oversubscribed.

Including earlier seed funding, Gimlet Labs has now raised a total of $92 million. Investors include Factory, which led the seed round, as well as Eclipse Ventures, Prosperity7, and Triatomic. The company has also attracted individual backers such as Sequoia’s Bill Coughran, Stanford professor Nick McKeown, former VMware CEO Raghu Raghuram, and Intel CEO Lip-Bu Tan.

Currently, Gimlet Labs employs around 30 people as it continues to scale its technology and expand its presence in the rapidly evolving AI infrastructure market.

Shivangi Yadav reports on startups, technology policy, and other significant technology-focused developments in India for TechAmerica.Ai. She previously worked as a research intern at ORF.