Reviews 3D reconstruction, including self-supervised, SLAM, and NeRF methods. Our approach uses open-set 2D instance segmentation and RGB-D back-projection for efficient instance-based 3D mapping.

Semantic Geometry Completion and SLAM Integration in 3D Mapping

2025/12/11 02:00

Abstract and 1 Introduction

  2. Related Works

    2.1. Vision-and-Language Navigation

    2.2. Semantic Scene Understanding and Instance Segmentation

    2.3. 3D Scene Reconstruction

  3. Methodology

    3.1. Data Collection

    3.2. Open-set Semantic Information from Images

    3.3. Creating the Open-set 3D Representation

    3.4. Language-Guided Navigation

  4. Experiments

    4.1. Quantitative Evaluation

    4.2. Qualitative Results

  5. Conclusion and Future Work, Disclosure statement, and References

2.3. 3D Scene Reconstruction

3D scene reconstruction has advanced significantly in recent years. One line of work takes a self-supervised approach to semantic geometry completion and appearance reconstruction from RGB-D scans, such as [26], which uses a 3D encoder-decoder architecture for geometry and colour. The focus of these approaches is on generating semantic reconstructions without ground truth. A second approach integrates real-time 3D reconstruction with SLAM through keyframe-based techniques, and has been used in recent autonomous navigation and AR applications [27]. A third applies Neural Radiance Fields (NeRF) [28] to indoor spaces, using structure-from-motion to understand camera-captured scenes. These NeRF models are trained separately for each location and are particularly good for spatial understanding. A fourth builds 3D scene graphs using open-vocabulary and foundation models such as CLIP to capture semantic relationships between objects and their visual representations [4]. During reconstruction, these methods extract features from the 3D point clouds and project them onto the embedding space learned by CLIP.

This work uses an open-set 2D instance segmentation method, as explained in the previous sections. Given an RGB-D image, we obtain individual object masks from the RGB image and back-project them to 3D using the depth image. This gives an instance-based approach, in contrast to the point-by-point reconstruction previously done by Concept-Fusion [29]. Extracting features per object mask also lets us compute embeddings that preserve the open-set nature of the pipeline.
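As an illustration of the back-projection step, the following is a minimal sketch rather than the authors' implementation. It assumes a standard pinhole camera model, which the text does not specify. The function name `backproject_mask` and the intrinsics `fx`, `fy`, `cx`, `cy` are hypothetical names introduced here. It is written in plain Python for readability; a real pipeline would most likely vectorise the loop with an array library.

```python
def backproject_mask(depth, mask, fx, fy, cx, cy):
    """Lift the pixels selected by one 2D instance mask into 3D camera coordinates.

    depth: rows of metric depth values; 0 marks a missing reading (assumption).
    mask:  rows of booleans, True where the instance is visible.
    fx, fy, cx, cy: pinhole intrinsics in pixels (hypothetical parameter names).
    Returns a list of (x, y, z) tuples, one per valid masked pixel.
    """
    points = []
    for v, (depth_row, mask_row) in enumerate(zip(depth, mask)):
        for u, (z, inside) in enumerate(zip(depth_row, mask_row)):
            if inside and z > 0:  # skip pixels outside the mask or without depth
                # Invert the pinhole projection u = fx * x / z + cx (same for v).
                x = (u - cx) * z / fx
                y = (v - cy) * z / fy
                points.append((x, y, z))
    return points


# Toy example: a 4x4 depth image at a constant 2 m, with a 2x2 object mask.
depth = [[2.0] * 4 for _ in range(4)]
depth[1][1] = 0.0  # simulate one missing depth reading inside the mask
mask = [[1 <= r <= 2 and 1 <= c <= 2 for c in range(4)] for r in range(4)]

points = backproject_mask(depth, mask, fx=2.0, fy=2.0, cx=2.0, cy=2.0)

# One point set per instance: summaries such as a centroid can be computed
# per object instead of per pixel.
centroid = tuple(sum(axis) / len(points) for axis in zip(*points))
```

Each mask yields its own point set, so per-object quantities such as a centroid or an embedding can be attached to the instance as a whole, in the spirit of the instance-based approach described above.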


:::info Authors:

(1) Laksh Nanwani, International Institute of Information Technology, Hyderabad, India; this author contributed equally to this work;

(2) Kumaraditya Gupta, International Institute of Information Technology, Hyderabad, India;

(3) Aditya Mathur, International Institute of Information Technology, Hyderabad, India; this author contributed equally to this work;

(4) Swayam Agrawal, International Institute of Information Technology, Hyderabad, India;

(5) A.H. Abdul Hafez, Hasan Kalyoncu University, Sahinbey, Gaziantep, Turkey;

(6) K. Madhava Krishna, International Institute of Information Technology, Hyderabad, India.

:::


:::info This paper is available on arXiv under the CC BY-SA 4.0 Deed (Attribution-ShareAlike 4.0 International) license.

:::
