Areas of Interest
Agentic AI Systems
Multi-Agent Collaboration
Frameworks for agents to work together - sharing context, delegating tasks, and coordinating actions across video, audio, and text modalities.
Agent Memory & Perception
How agents remember, recall, and reason about continuous media. Building long-term memory systems that work with video streams and recordings.
Autonomous Video Understanding
Agents that watch, understand, and act on video content - from surveillance feeds to screen recordings to live streams.
Tool Use & Action
Enabling agents to take meaningful actions based on what they see and hear - editing, annotating, searching, and generating video content.
Physical AI & Embodied Agents
Robotics & Perception
Vision systems for robots and autonomous machines. Processing real-world video streams for navigation, manipulation, and interaction.
Embodied Learning
Training agents that learn from video demonstrations. Sim-to-real transfer and video-based imitation learning.
Real-World Deployment
Taking agentic systems from lab to production. Handling edge cases, failures, and real-world complexity.
Video Understanding for Agents
Scene & Activity Detection
Identifying scenes, activities, and events in video streams. Creating coherent segments for agent reasoning.
Temporal Reasoning
Understanding causality, sequences, and time in video. What happened, what’s happening, what might happen next.
Multimodal RAG
Retrieval systems that work across video, audio, and text. Finding relevant moments and context for agent decision-making.
Code & Model Development
Open Source Models
Training and fine-tuning video models. Managing terabytes of training data, annotations, and evaluation pipelines.
Benchmarking
Evaluating vision and video models. Pushing beyond current benchmarks to test real-world agent capabilities.
Code Generation from Video
Systems that watch tutorials, demos, or documentation and generate working code.
Let’s Collaborate
If you’re working on agentic AI or physical AI and want to collaborate, reach out. We can provide:
- Infrastructure support for managing video data at scale
- API access for video understanding and manipulation
- Technical collaboration with our team on shared research