Blog

Research updates, lab news, and stories from VU Lab

Viva: Human-Centered Situational Decision-Making

VIVA and VIVA+ advance human-centered situational decision-making for multimodal AI by evaluating how models reason about visual context, human values, and action choices.

BARD-GS: Blur-Aware Reconstruction of Dynamic Scenes via Gaussian Splatting

BARD-GS is a novel approach for robust dynamic scene reconstruction that effectively handles blurry inputs and imprecise camera poses.

Balancing Fidelity and Diversity: Synthetic Data Could Stand on the Shoulder of the Real in Visual Recognition

Investigates how data fidelity and diversity affect recognition performance through synthetic data curation, offering training-free improvements for visual recognition tasks.

Visual Understanding Benchmark for Open-World Scenes

Our lab offers a diverse range of benchmarks, including YesBut, Nebular, Viva, and Causal 3D, focused on advancing open-world visual understanding and robotic interaction.

Latest Posts

GSMem: 3D Gaussian Splatting as Persistent Spatial Memory for Zero-Shot Embodied Exploration and Reasoning

We use 3DGS as a persistent spatial memory for embodied navigation, enabling the agent to “hallucinate” optimal views for high-fidelity Vision-Language Model (VLM) reasoning.

Reconstruction Matters: Learning Geometry-Aligned BEV Representation through 3D Gaussian Splatting

We propose Splat2BEV, a Gaussian Splatting-assisted BEV perception framework that aims to learn BEV feature representations that are both semantically rich and geometrically precise.