We propose Segment then Splat, an open-vocabulary 3D segmentation method that reverses the long-established approach of “segmentation after reconstruction” by dividing Gaussians into distinct object sets before reconstruction.
Praxis-VLM studies vision-grounded decision making through text-driven reinforcement learning, connecting visual observations with language-guided policy learning.
We propose Noise Guided Splatting, a method that handles the inherent “false transparency” artifact in 3DGS by injecting opaque noise Gaussians into the object volume during training, which encourages surface Gaussians to adopt higher opacity.
A comprehensive survey addressing how VLMs currently lack spatial intelligence, covering recent advances, taxonomies, and evaluations toward building spatially intelligent AI.