Ex-Googlers are building infrastructure to help companies understand their video data

Former Google engineers are building AI infrastructure that helps companies analyse, search, and extract insights from large volumes of video data.

Feb 12, 2026
Image Credits: InfiniMind

Companies today are producing unprecedented volumes of video. From decades of broadcast archives to thousands of in-store surveillance cameras and vast libraries of production footage, enormous amounts of content are being stored but rarely reviewed or analysed. This accumulation of unused material represents what many describe as dark data — information that organisations automatically collect yet seldom convert into meaningful insights.

Recognising this gap, Aza Kai (CEO) and Hiraku Yanagita (COO), two former Google colleagues who collaborated for nearly ten years at Google Japan, set out to address it. The pair founded InfiniMind, a Tokyo-based startup focused on building infrastructure that transforms petabytes of unseen video and audio into structured, searchable, and business-ready data.

Kai explained that both founders observed the coming shift while still at Google Japan, where Yanagita led brand and data solutions, and Kai worked across cloud computing, machine learning, advertising systems, and video recommendation technologies before eventually heading data science teams. According to Kai, by 2024 the underlying technology had progressed sufficiently, and market demand was strong enough, to justify establishing the company independently.

Historically, available solutions required compromise. Earlier technologies could identify objects within individual frames but struggled to follow narratives, interpret causality, or respond to more complex queries about video content. For organisations holding decades of broadcast archives and enormous volumes of footage, even answering straightforward questions about what existed in their libraries proved difficult.

The turning point came with advancements in vision-language models between 2021 and 2023. During this period, video AI began advancing beyond basic object recognition. While declining GPU costs and steady annual performance improvements—estimated at 15% to 20% over the past decade—played a role, Kai emphasised that the real transformation was in model capability. Until recently, the available systems were not powerful enough to deliver meaningful results at scale.
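The distinction is easier to see in code. Below is a minimal Python sketch contrasting the two generations of systems: the OpenCV frame sampling is real library usage, but answer_about_frames is a hypothetical stand-in for a vision-language model, not any specific product's API.

```python
# Sketch: frame-level recognition vs. a cross-frame vision-language query.
# Only the OpenCV frame sampling is real; `answer_about_frames` is a
# hypothetical placeholder for a vision-language model.
import cv2


def sample_frames(path: str, every_n_seconds: float = 5.0) -> list:
    """Grab one frame every `every_n_seconds` from a video file."""
    capture = cv2.VideoCapture(path)
    fps = capture.get(cv2.CAP_PROP_FPS) or 30.0  # fall back if FPS is unknown
    step = max(1, int(fps * every_n_seconds))
    frames, index = [], 0
    while True:
        ok, frame = capture.read()
        if not ok:
            break
        if index % step == 0:
            frames.append(frame)
        index += 1
    capture.release()
    return frames


def answer_about_frames(frames: list, question: str) -> str:
    """Hypothetical VLM call. Earlier systems could only label objects in
    each frame; a vision-language model can answer questions that span
    the whole clip, such as narrative or causal questions."""
    return f"[placeholder answer to: {question}]"


frames = sample_frames("broadcast.mp4")
# Old-style systems: per-frame labels only ("car", "person", "logo").
# VLM-style systems: questions that require following the whole clip.
print(answer_about_frames(frames, "Why does the presenter hold up the product?"))
```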

InfiniMind has secured $5.8 million in seed funding, with UTEC leading the round and participation from CX2, Headline Asia, Chiba Dojo, and an AI researcher affiliated with a16z Scout. The company plans to relocate its headquarters to the United States while maintaining its presence in Japan. Japan served as an ideal testing environment, offering strong hardware infrastructure, skilled engineering talent, and a supportive startup ecosystem. This allowed the team to refine its technology alongside demanding enterprise customers before expanding internationally.

The company’s first product, TV Pulse, launched in Japan in April 2025. The platform leverages AI to analyse television broadcasts in real time, enabling media and retail clients to monitor product placements, brand visibility, audience sentiment, and the overall impact of public relations efforts. Following pilot initiatives with major broadcasters and agencies, the product has already attracted paying customers, including wholesalers and media organisations.
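As a rough illustration of the kind of aggregation such monitoring implies, the sketch below rolls a stream of per-frame brand detections into per-minute visibility counts. The detection feed and field names are assumptions for illustration, not TV Pulse's actual interface.

```python
# Sketch: turning per-frame brand detections into per-minute visibility
# counts, the kind of rollup a broadcast monitor needs. The detection
# feed is assumed to come from an upstream model.
from collections import Counter, defaultdict

# (timestamp_seconds, brand) pairs, as an upstream detector might emit.
detections = [
    (12.4, "BrandA"), (13.1, "BrandA"), (75.0, "BrandB"),
    (80.3, "BrandA"), (118.9, "BrandB"), (121.2, "BrandB"),
]

visibility = defaultdict(Counter)  # minute -> brand -> frame hits
for ts, brand in detections:
    visibility[int(ts // 60)][brand] += 1

for minute in sorted(visibility):
    for brand, hits in visibility[minute].most_common():
        print(f"minute {minute:>3}: {brand} seen in {hits} frame(s)")
```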

InfiniMind is now preparing for broader global expansion. Its primary offering, DeepFrame, is a long-form video intelligence system that processes up to 200 hours of footage to identify precise scenes, speakers, or events within that content. A beta version is scheduled for release in March 2026, with a full commercial launch planned for April 2026, according to Kai.
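A toy sketch of what timestamped retrieval over long footage looks like appears below. The Segment structure and naive keyword matching are illustrative assumptions, not DeepFrame's actual design.

```python
# Sketch of timestamped retrieval over long-form footage: given
# per-segment descriptions (however they were produced), find where a
# query matches. A hypothetical illustration, not DeepFrame's API.
from dataclasses import dataclass


@dataclass
class Segment:
    start_s: float     # segment start, in seconds
    end_s: float       # segment end, in seconds
    speaker: str       # who is on screen or speaking
    description: str   # what happens in the segment


def find(segments: list[Segment], query: str) -> list[Segment]:
    """Naive keyword search; a production system would likely rank
    segments with embeddings rather than substring matches."""
    q = query.lower()
    return [s for s in segments
            if q in s.description.lower() or q in s.speaker.lower()]


archive = [
    Segment(0, 42, "anchor", "opening headlines"),
    Segment(42, 310, "guest", "interview about retail supply chains"),
    Segment(310, 365, "anchor", "product demo of a new camera"),
]
for hit in find(archive, "product demo"):
    print(f"{hit.start_s:.0f}s-{hit.end_s:.0f}s: {hit.speaker}: {hit.description}")
```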

The market for video analysis remains highly segmented. Kai noted that companies like TwelveLabs offer general-purpose video understanding APIs for a broad range of users, including consumers, prosumers, and enterprises. In contrast, InfiniMind focuses on enterprise-level applications, including monitoring, safety and security, and extracting deeper insights from video archives.

Kai highlighted that the company’s platform requires no client-side coding. Organisations upload their data, and the system processes it to generate actionable insights. Beyond visual analysis, the technology integrates audio and speech-recognition capabilities. It can handle unlimited video duration and is designed with cost efficiency as a key differentiator. While many existing solutions prioritise either accuracy or niche use cases, Kai argued that few effectively address the economic challenges of large-scale video processing.
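The workflow Kai describes resembles the upload-and-poll pattern common to hosted analysis services. The sketch below shows that pattern; every URL, endpoint, and field name is hypothetical, not InfiniMind's real API.

```python
# Sketch of an upload-and-wait workflow: push footage to a service,
# poll until processing finishes, then fetch insights. All endpoints
# and field names here are hypothetical.
import time

import requests

BASE = "https://api.example-video-insights.com"  # hypothetical endpoint

with open("store_cam_footage.mp4", "rb") as f:
    job = requests.post(f"{BASE}/v1/videos", files={"file": f}).json()

# Poll until the service reports the job finished.
while True:
    status = requests.get(f"{BASE}/v1/jobs/{job['id']}").json()
    if status["state"] in ("done", "failed"):
        break
    time.sleep(10)

if status["state"] == "done":
    insights = requests.get(f"{BASE}/v1/jobs/{job['id']}/insights").json()
    print(insights)  # e.g. detected events, transcripts, brand mentions
```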

The newly secured funding will support ongoing development of the DeepFrame model, expansion of engineering infrastructure, recruitment of additional technical talent, and customer acquisition efforts across both Japan and the United States.

Kai described the field as a pathway toward artificial general intelligence (AGI). He emphasised that building systems capable of understanding complex video content is fundamentally about interpreting real-world environments. While industrial and commercial applications remain central to the company’s strategy, the broader ambition is to advance technological capabilities to help people better understand reality and make more informed decisions.

Shivangi Yadav reports on startups, technology policy, and other significant technology-focused developments in India for TechAmerica.Ai. She previously worked as a research intern at ORF.