
Spatial Intelligence in Vision-Language Models: A Comprehensive Survey

A comprehensive survey of how current vision-language models (VLMs) fall short on spatial intelligence, covering recent advances, taxonomies, and evaluations aimed at building spatially intelligent AI.

BARD-GS: Blur-Aware Reconstruction of Dynamic Scenes via Gaussian Splatting

BARD-GS is a novel approach for robust dynamic scene reconstruction that effectively handles blurry inputs and imprecise camera poses.

Balancing Fidelity and Diversity: Synthetic Data Could Stand on the Shoulder of the Real in Visual Recognition

Investigates how data fidelity and diversity affect recognition performance through synthetic data curation, offering training-free improvements for visual recognition tasks.

Visual Understanding Benchmark for Open-World Scenes

Our lab offers a diverse range of benchmarks, including YesBut, Nebular, Viva, and Causal 3D, focused on advancing open-world visual understanding and robotic interaction.
