Blog

Blog sub title

Multimodal Retrieval and Spatial Memory Indexing

We study efficient multimodal retrieval, memory indexing, and semantic search over spatial experiences.

Human-Robot Collaboration and Preference Alignment

We study human-robot collaboration, preference alignment, and adaptive interfaces for embodied AI systems.

Scalable Policy Learning for Mobile Platforms

We study scalable policy learning for mobile platforms operating across weather, terrain, and sensing conditions.

Creating a post series

Since version 0.12, you can make a post part of a series that links to the other posts in that series: create a series data file, then set the series in each post's front matter.

Creating a docs site with Bulma Clean Theme

I created Bulma Clean Theme as a theme for my own website and decided to open source it so others could use it as well. One of my key goals was a theme that works with GitHub Pages, which means you can also use it as a docs site for your project.

Latest Posts

GSMem: 3D Gaussian Splatting as Persistent Spatial Memory for Zero-Shot Embodied Exploration and Reasoning

We use 3DGS as a persistent spatial memory for embodied navigation, enabling the agent to “hallucinate” optimal views for high-fidelity Vision-Language Model (VLM) reasoning.

Spatial Memory for Long-Horizon Embodied Agents

We are prototyping placeholder systems that combine spatial memory, semantic retrieval, and planning to support embodied agents acting over long horizons.

Visual Understanding Benchmark for Open-World Scenes

We are building a placeholder benchmark suite for evaluating open-world visual understanding across long-tail scene categories, ambiguous contexts, and multimodal evidence.