Blog

Robust 3D Mapping and Adaptive Navigation

We study robust 3D mapping, geometry-aware learning, and adaptive navigation in dynamic settings.

Video-Language Grounding for Open-World Agents

We study video-language grounding and open-world recognition for long-term autonomous agents.

Foundation Models for Robotic Manipulation

We study foundation models for robotic manipulation, task abstraction, and reusable control primitives.

Uncertainty Estimation and Robust Deployment

We study uncertainty estimation, risk-sensitive inference, and robust deployment under distribution shift.

Interactive 3D World Models for Embodied Training

We study interactive 3D world models for simulation, prediction, and scalable embodied training pipelines.

Latest Posts

GSMem: 3D Gaussian Splatting as Persistent Spatial Memory for Zero-Shot Embodied Exploration and Reasoning

We use 3D Gaussian Splatting (3DGS) as a persistent spatial memory for embodied navigation, enabling the agent to "hallucinate" optimal views for high-fidelity Vision-Language Model (VLM) reasoning.

Spatial Memory for Long-Horizon Embodied Agents

We are prototyping systems that combine spatial memory, semantic retrieval, and planning to support embodied agents acting over long horizons.

Visual Understanding Benchmark for Open-World Scenes

We are building a benchmark suite for evaluating open-world visual understanding across long-tail scene categories, ambiguous contexts, and multimodal evidence.