Abstract:
3D scene reconstruction is a critical research topic in autonomous driving, robotics, and related fields, with extensive applications in navigation and mapping, environmental interaction, and virtual/augmented reality. From the perspectives of scene representation and core modeling techniques, current deep learning-based 3D scene reconstruction methods can be grouped into five categories: cost volume-based depth estimation, truncated signed distance function-based voxel approaches, transformer-based large-scale feedforward methods, multilayer perceptron-based neural radiance fields, and 3D Gaussian splatting. Each category exhibits distinct strengths and limitations. The emerging 3D Gaussian splatting method represents scenes explicitly with Gaussian primitives and achieves rapid rendering and novel view synthesis through efficient rasterization, departing from the neural radiance field paradigm of implicit scene representation. Its most significant advantage is that it combines efficient rendering with interpretable, editable scene modeling, thereby paving the way for accurate 3D scene reconstruction. Nevertheless, 3D Gaussian splatting still faces numerous challenges in practical reconstruction applications. Accordingly, this paper first provides a concise introduction to the fundamentals of 3D Gaussian splatting and compares it with the other four categories of methods. Following a systematic survey of existing 3D Gaussian splatting reconstruction algorithms, we summarize the key challenges these methods address and review current research progress on the core technical difficulties through representative case studies. Finally, we outline promising directions for future research.
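To make the explicit representation mentioned above concrete, the sketch below (our own illustration, not the surveyed algorithm: it uses isotropic 2D Gaussians in pixel space, constant colors, and no view-dependent effects or projection) shows the core of splatting-style rendering, i.e. sorting Gaussian primitives by depth and alpha-compositing them front to back onto an image grid.

```python
import numpy as np

def render_gaussians(means, colors, opacities, sigmas, depths, H=32, W=32):
    """Alpha-composite isotropic 2D Gaussians front-to-back onto an H x W image.

    means:     (N, 2) pixel-space centers
    colors:    (N, 3) RGB values in [0, 1]
    opacities: (N,)   peak opacity of each Gaussian
    sigmas:    (N,)   isotropic standard deviation in pixels
    depths:    (N,)   smaller = closer to the camera
    """
    ys, xs = np.mgrid[0:H, 0:W]
    image = np.zeros((H, W, 3))
    transmittance = np.ones((H, W))    # fraction of light not yet absorbed
    for i in np.argsort(depths):       # composite nearest Gaussians first
        d2 = (xs - means[i, 0]) ** 2 + (ys - means[i, 1]) ** 2
        alpha = opacities[i] * np.exp(-0.5 * d2 / sigmas[i] ** 2)
        image += (transmittance * alpha)[..., None] * colors[i]
        transmittance *= 1.0 - alpha   # occlusion by closer Gaussians
    return image

# Example: a red Gaussian in front of a blue one at the same center;
# the front Gaussian dominates the composited color there.
means = np.array([[16.0, 16.0], [16.0, 16.0]])
colors = np.array([[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
img = render_gaussians(means, colors,
                       opacities=np.array([0.9, 0.9]),
                       sigmas=np.array([3.0, 3.0]),
                       depths=np.array([0.0, 1.0]))
```

Because every primitive is an explicit, parameterized Gaussian, the scene can be edited directly (moving a mean, changing a color), which is the interpretability advantage the abstract contrasts with implicit neural radiance fields.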