We’re building the perception layer for AI agents - giving machines eyes and ears. If you’re working on agentic systems or physical AI, we’d love to collaborate.

Areas of Interest

Agentic AI Systems

  • Frameworks for agents to work together - sharing context, delegating tasks, and coordinating actions across video, audio, and text modalities.
  • How agents remember, recall, and reason about continuous media. Building long-term memory systems that work with video streams and recordings.
  • Agents that watch, understand, and act on video content - from surveillance feeds to screen recordings to live streams.
  • Enabling agents to take meaningful actions based on what they see and hear - editing, annotating, searching, and generating video content.

Physical AI & Embodied Agents

  • Vision systems for robots and autonomous machines. Processing real-world video streams for navigation, manipulation, and interaction.
  • Training agents that learn from video demonstrations. Sim-to-real transfer and video-based imitation learning.
  • Taking agentic systems from lab to production. Handling edge cases, failures, and real-world complexity.
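As a toy illustration of learning from video demonstrations, the sketch below does nearest-neighbour behavior cloning over (observation, action) pairs. The demonstration data and the 2-D observations are invented for the example; in practice the observations would come from a tracker or encoder applied to demonstration video.

```python
import math

# Hypothetical (observation, action) pairs extracted from a demonstration.
demo = [
    ([0.0, 0.0], "forward"),
    ([1.0, 0.0], "turn_left"),
    ([0.0, 1.0], "turn_right"),
]

def policy(obs: list[float]) -> str:
    """Imitate the demonstrator: copy the action of the nearest observation."""
    _, action = min(demo, key=lambda pair: math.dist(pair[0], obs))
    return action
```

Real imitation learning would fit a parametric policy rather than a lookup, but the core idea - map the current observation to the demonstrator's action - is the same.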

Video Understanding for Agents

  • Identifying scenes, activities, and events in video streams. Creating coherent segments for agent reasoning.
  • Understanding causality, sequences, and time in video. What happened, what’s happening, what might happen next.
  • Retrieval systems that work across video, audio, and text. Finding relevant moments and context for agent decision-making.
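Cross-modal retrieval of the kind described above typically embeds video moments and text queries into a shared vector space and ranks by similarity. Here is a minimal sketch; the moment names and 3-D vectors are made up, standing in for embeddings a multimodal encoder would produce.

```python
import math

# Hypothetical pre-computed moment embeddings in a shared space.
moments = {
    "goal_at_00:12":       [0.9, 0.1, 0.0],
    "interview_at_01:05":  [0.1, 0.8, 0.3],
    "crowd_noise_at_02:30": [0.0, 0.2, 0.9],
}

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def retrieve(query_vec: list[float], k: int = 2) -> list[str]:
    """Rank stored video moments by cosine similarity to a query embedding."""
    ranked = sorted(moments, key=lambda m: cosine(moments[m], query_vec),
                    reverse=True)
    return ranked[:k]
```

At scale the brute-force sort would be replaced by an approximate nearest-neighbour index, but the interface - query embedding in, ranked moments out - is what an agent consumes.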

Code & Model Development

  • Training and fine-tuning video models. Managing terabytes of training data, annotations, and evaluation pipelines.
  • Evaluating vision and video models. Pushing beyond current benchmarks to test real-world agent capabilities.
  • Systems that watch tutorials, demos, or documentation and generate working code.
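One common building block for the evaluation work mentioned above is temporal IoU: scoring how well a predicted time span matches a ground-truth span. A minimal sketch (the threshold and recall-style metric are illustrative choices, not a specific benchmark's protocol):

```python
def temporal_iou(pred: list[float], gt: list[float]) -> float:
    """Intersection-over-union of two [start, end] time spans, in seconds."""
    inter = max(0.0, min(pred[1], gt[1]) - max(pred[0], gt[0]))
    union = (pred[1] - pred[0]) + (gt[1] - gt[0]) - inter
    return inter / union if union > 0 else 0.0

def evaluate(predictions, ground_truth, threshold: float = 0.5) -> float:
    """Fraction of ground-truth spans matched by some prediction at the threshold."""
    hits = sum(
        any(temporal_iou(p, g) >= threshold for p in predictions)
        for g in ground_truth
    )
    return hits / len(ground_truth)
```

Published video benchmarks differ in matching rules and thresholds, but most reduce to a span-overlap computation like this one.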

Let’s Collaborate

If you’re working on agentic AI or physical AI and want to collaborate, reach out. We can provide:
  • Infrastructure support for managing video data at scale
  • API access for video understanding and manipulation
  • Technical collaboration with our team on shared research