Blog

BARD-GS: Blur-Aware Reconstruction of Dynamic Scenes via Gaussian Splatting

BARD-GS is a novel approach for robust dynamic scene reconstruction that effectively handles blurry inputs and imprecise camera poses.

CAUSAL3D: A Comprehensive Benchmark for Causal Learning from Visual Data

A 19-dataset benchmark evaluating causal reasoning capabilities in vision models, revealing that performance drops sharply as causal complexity increases.

NEBULA: A Unified Framework for Evaluating Embodied AI Systems

A unified ecosystem for evaluating embodied AI systems beyond coarse task success metrics, combining capability tests for fine-grained skill diagnosis and stress tests for robustness under real-world perturbations.

RT-LTP: Real-Time Latent Trajectory Prediction via Efficient Online Adaptation

A real-time trajectory prediction framework that reformulates forecasting as a latent-space alignment problem. It achieves up to 54% faster adaptation and 9.9% higher accuracy across multiple autonomous driving benchmarks.

View-consistent Object Removal in Radiance Fields

We introduce a novel radiance field (RF) editing pipeline that significantly enhances consistency by requiring the inpainting of only a single reference image. This image is then propagated across multiple views using a depth-based approach to maintain consistency.

Latest Posts

GSMem: 3D Gaussian Splatting as Persistent Spatial Memory for Zero-Shot Embodied Exploration and Reasoning

We use 3DGS as a persistent spatial memory for embodied navigation, enabling the agent to "hallucinate" optimal views for high-fidelity Vision-Language Model (VLM) reasoning.

Reconstruction Matters: Learning Geometry-Aligned BEV Representation through 3D Gaussian Splatting

We propose Splat2BEV, a Gaussian Splatting-assisted BEV perception framework that aims to learn BEV feature representations that are both semantically rich and geometrically precise.