One paper has been accepted by IEEE TMM 🎉
Title: 3D Semantic Gaussian via Geometric-Semantic Hypergraph Computation
Semantic labels are inherently tied to geometry and luminance reconstruction, since entities with similar shapes and appearances often share categories. Traditional methods use synthesis-analysis, NeRF, or 3D Gaussian representations to encode semantics and geometry separately. However, 2D methods lack view consistency, NeRF extensions are slow, and faster 3D Gaussian methods risk spatial and channel inconsistencies between the semantic and RGB branches. Moreover, these methods require costly, manually annotated dense semantic labels. To reduce this annotation cost and achieve effective semantic reconstruction from sparse inputs while also improving RGB rendering quality, we build upon 3D Gaussian representations: semantic features from pre-trained models (requiring no additional ground-truth input) are integrated into the Gaussian features, and a hypergraph neural network is constructed to capture higher-order correlations across RGB and semantic information as well as between different frames. Hypergraphs use hyperedges to link multiple vertices at once, capturing the complex relationships essential for cross-modal tasks; this higher-order structure addresses a limitation of NeRF- and Gaussian-based methods, which lack the capacity for such associations. The framework enables precise novel view synthesis and 2D semantic reconstruction without manual annotations, achieving state-of-the-art results on RGB and semantic tasks for room-scale scenes in the ScanNet and Replica datasets, while supporting real-time rendering at 34 FPS.
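
To give a concrete sense of the hypergraph computation mentioned above, here is a minimal sketch of a standard hypergraph convolution layer using HGNN-style propagation, X' = D_v^{-1/2} H W D_e^{-1} H^T D_v^{-1/2} X Θ. The class name, feature dimensions, and the idea of forming vertex features from concatenated RGB and semantic features are illustrative assumptions, not the paper's actual implementation.

```python
# A minimal sketch of hypergraph convolution (HGNN-style propagation), shown only
# to illustrate how hyperedges aggregate information from multiple vertices.
# Names and feature layouts are hypothetical, not the paper's implementation.
import torch
import torch.nn as nn


class HypergraphConv(nn.Module):
    """Computes X' = D_v^{-1/2} H W D_e^{-1} H^T D_v^{-1/2} X Theta."""

    def __init__(self, in_dim: int, out_dim: int):
        super().__init__()
        self.theta = nn.Linear(in_dim, out_dim, bias=False)

    def forward(self, x: torch.Tensor, incidence: torch.Tensor) -> torch.Tensor:
        # x: (N, in_dim) vertex features, e.g. concatenated RGB and semantic features
        # incidence: (N, E) binary matrix; incidence[v, e] = 1 if vertex v lies on hyperedge e
        w = torch.ones(incidence.shape[1], device=x.device)       # uniform hyperedge weights
        d_v = (incidence * w).sum(dim=1).clamp(min=1e-6)          # vertex degrees
        d_e = incidence.sum(dim=0).clamp(min=1e-6)                # hyperedge degrees
        x = self.theta(x)
        # vertex -> hyperedge aggregation, then hyperedge -> vertex propagation
        edge_feat = (incidence / d_e).T @ (x / d_v.sqrt().unsqueeze(1))
        out = (incidence * w) @ edge_feat / d_v.sqrt().unsqueeze(1)
        return out
```

In this sketch, hyperedges could group Gaussians with similar geometry or semantics, or group observations of the same region across frames; each hyperedge then mixes the features of all vertices it connects, which is the higher-order association a plain pairwise graph cannot express in a single edge.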
