Multiverse Computing brings compressed AI models into mainstream adoption
Multiverse Computing is accelerating the adoption of compressed AI models, reducing model size and costs while maintaining performance for enterprise and real-world applications.
With private company default rates rising to around 9.2% — the highest level in years — venture capital firm Lux Capital recently advised companies dependent on AI to secure their compute capacity commitments in writing. As financial instability affects the AI supply chain, the firm warned that informal agreements are no longer sufficient.
However, there is an alternative approach: avoiding reliance on external compute infrastructure altogether. Smaller AI models that can run directly on local devices, without depending on data centres, cloud providers, or external partners, are becoming increasingly capable. Multiverse Computing is positioning itself at the forefront of this shift.
The Spanish startup has maintained a relatively low profile compared to some of its competitors, but growing demand for efficient AI solutions is bringing it into focus. After compressing models from leading AI companies such as OpenAI, Meta, DeepSeek, and Mistral AI, Multiverse has introduced both an application to demonstrate its compressed models and an API portal that allows developers to access and build with them more broadly.
The company’s CompactifAI app, named after its quantum-inspired compression technology, functions as an AI chat tool similar to platforms like ChatGPT or Mistral’s Le Chat. Users can ask questions and receive responses, but the key distinction lies in its use of a compact model called Gilda, which is small enough to operate locally and offline, according to the company.
For users, this is a form of edge AI: data stays on the device, and no internet connection is required. That capability depends on sufficient hardware resources, however, including memory and storage. Older smartphones, including some iPhone models, may not meet these requirements. In such cases, the app automatically switches to cloud-based models through an API, a handoff managed by a system called Ash Nazg, a reference to “The Lord of the Rings.” When this fallback occurs, the privacy advantage of local processing no longer applies.
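Multiverse has not published how Ash Nazg decides between local and cloud execution, but the pattern it describes is a simple capability check with a network fallback. The sketch below illustrates that pattern in Python; the names, thresholds, and stub functions are all hypothetical, not taken from the app.

# Hypothetical sketch of an on-device-first router with a cloud fallback.
# DeviceSpec, the thresholds, and the stub functions are illustrative only.
from dataclasses import dataclass

@dataclass
class DeviceSpec:
    free_ram_gb: float      # memory available to the app
    free_storage_gb: float  # space to hold the model weights

# Assumed minimums for running a small local model; real values unknown.
MIN_RAM_GB = 4.0
MIN_STORAGE_GB = 2.0

def answer(prompt: str, device: DeviceSpec) -> str:
    """Prefer the local model; fall back to a hosted one if the device
    cannot run it. The fallback trades away the offline and privacy
    benefits of staying on-device."""
    if device.free_ram_gb >= MIN_RAM_GB and device.free_storage_gb >= MIN_STORAGE_GB:
        return run_local_model(prompt)   # on-device, works offline
    return call_cloud_api(prompt)        # request leaves the device

def run_local_model(prompt: str) -> str:
    # Placeholder: a real app would load compressed weights and run inference.
    return f"[local reply to: {prompt}]"

def call_cloud_api(prompt: str) -> str:
    # Placeholder: a real app would POST the prompt to a hosted model.
    return f"[cloud reply to: {prompt}]"

print(answer("Draft a reply to this email.", DeviceSpec(free_ram_gb=6.0, free_storage_gb=8.0)))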
These constraints mean the app is not yet fully suited for widespread consumer adoption, although that may not be its primary purpose. Data from Sensor Tower indicates that CompactifAI recorded fewer than 5,000 downloads in the past month.
Multiverse’s main focus is on enterprise adoption. The company is now launching a self-service API portal that provides developers and businesses with direct access to its compressed models, eliminating the need for platforms like AWS Marketplace.
“The CompactifAI API portal [now] gives developers direct access to compressed models with the transparency and control needed to run them in production,” CEO Enrique Lizaso said in a statement.
A notable feature of the API is real-time usage monitoring. This aligns with one of the key motivations for enterprises exploring smaller models: reducing compute costs. Compared with full-size large language models (LLMs), compact models can deliver comparable results on many tasks while consuming far less compute, which lowers operational expenses.
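As a rough illustration of what building against such a portal could look like, the sketch below assumes an OpenAI-style chat-completions interface and reads the per-request token usage from the response. The base URL, model name, and response shape are assumptions made for illustration, not confirmed details of the CompactifAI API.

# Sketch of calling a hosted compressed model and tracking per-request cost,
# assuming an OpenAI-style chat-completions interface. The base URL, model
# name, and response shape are placeholders, not confirmed CompactifAI values.
import requests

BASE_URL = "https://api.example-provider.com/v1"  # placeholder endpoint
API_KEY = "YOUR_API_KEY"

def chat(prompt: str, model: str = "compressed-model-id") -> dict:
    resp = requests.post(
        f"{BASE_URL}/chat/completions",
        headers={"Authorization": f"Bearer {API_KEY}"},
        json={"model": model, "messages": [{"role": "user", "content": prompt}]},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()

reply = chat("Summarise the attached meeting notes.")
print(reply["choices"][0]["message"]["content"])

# OpenAI-style responses carry a usage block; logging it per request gives
# the kind of real-time cost visibility the portal is built around.
usage = reply["usage"]
print(usage["prompt_tokens"], usage["completion_tokens"], usage["total_tokens"])

Logging the usage block on every request is the simplest way to reconcile what an application actually spends against what a provider’s dashboard reports.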
Advancements in smaller models are also improving their capabilities. Recently, Mistral introduced Mistral Small 4, a model designed to handle general chat, coding tasks, agent-based workflows, and reasoning. The company also released Forge, a system that allows businesses to build customised models, including smaller models optimised for specific use cases.
Multiverse’s own developments indicate that the performance gap between compact models and larger LLMs is narrowing. Its latest compressed model, HyperNova 60B 2602, is based on gpt-oss-120b, an open-weight model released by OpenAI. The company claims the compressed version delivers faster responses at a lower cost than the original, which is particularly valuable for agent-driven coding tasks that involve multi-step processes.
Developing models that are both small enough to run on mobile devices and still effective remains a significant challenge. Apple has approached this by combining on-device and cloud-based AI through Apple Intelligence. Multiverse’s CompactifAI app also uses a hybrid approach, routing certain requests to cloud-based models when necessary. However, the company’s primary goal is to demonstrate the broader advantages of local AI models such as Gilda and their future iterations.
For professionals working in sensitive or mission-critical environments, running AI models locally can enhance privacy and reliability, especially when connectivity is limited or unavailable. These benefits extend to applications such as drones, satellites, and other systems that operate outside traditional network infrastructure.
Multiverse already serves more than 100 global clients, including the Bank of Canada, Bosch, and Iberdrola. Expanding its customer base could support further funding efforts. After raising $215 million in a Series B round last year, the company is now reportedly seeking an additional €500 million in funding at a valuation exceeding €1.5 billion.