3D Gaussian Splatting: Research Status and Challenges in Scene Reconstruction
Author:
Affiliation: School of Automation and Intelligence, Beijing Jiaotong University

CLC Number: TP391

Fund Project: Supported in part by the National Science and Technology Innovation 2030 (STI2030) Major Project under Grant 2022ZD0205005 and in part by the Fundamental Research Funds for the Central Universities under Grant 2023YJS142.

    Abstract:

    3D scene reconstruction is a critical research topic in autonomous driving, robotics, and related fields, with extensive applications in navigation and mapping, environmental interaction, and virtual/augmented reality. From the perspectives of scene representation and core modeling technique, current deep learning-based reconstruction methods fall into five categories: cost-volume-based depth estimation, truncated signed distance function (TSDF)-based voxel methods, Transformer-based large-scale feedforward methods, multilayer perceptron (MLP)-based neural radiance fields (NeRF), and 3D Gaussian Splatting (3DGS), each with its own strengths and limitations. The emerging 3DGS method represents scenes explicitly with Gaussian primitives and achieves fast rendering and novel view synthesis through efficient rasterization. Its most significant advantage is that, by departing from NeRF's implicit MLP-based representation, 3DGS combines efficient rendering with interpretable, editable scene modeling, paving the way for accurate 3D scene reconstruction. Nevertheless, 3DGS still faces numerous challenges in practical reconstruction applications. Accordingly, this paper first gives a concise introduction to the fundamentals of 3DGS and compares it with the other four categories of methods. After a systematic survey of existing 3DGS reconstruction algorithms, we summarize the key challenges these algorithms address and review current progress on the core technical difficulties through representative case studies. Finally, we outline promising directions for future research.
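    As background for the abstract above, the core formulation of 3DGS [7] can be condensed into the standard equations below; this sketch follows the notation of the original paper and is included only for orientation. Each primitive is an anisotropic 3D Gaussian with center \boldsymbol{\mu} and covariance \Sigma,

    \[ G(\mathbf{x}) = \exp\left( -\tfrac{1}{2} (\mathbf{x} - \boldsymbol{\mu})^{\top} \Sigma^{-1} (\mathbf{x} - \boldsymbol{\mu}) \right), \qquad \Sigma = R\,S\,S^{\top} R^{\top}, \]

    where factoring \Sigma into a rotation R and a diagonal scaling S keeps it positive semi-definite during optimization. After the Gaussians are splatted to screen space, the color C of a pixel is obtained by front-to-back alpha blending over the N depth-sorted Gaussians covering it,

    \[ C = \sum_{i \in N} c_i\,\alpha_i \prod_{j=1}^{i-1} (1 - \alpha_j), \]

    where c_i is the view-dependent color (encoded with spherical harmonics) and \alpha_i is the opacity-weighted falloff of the projected 2D Gaussian.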

    References
    [1] Wang K, Shen S. MVDepthNet: Real-Time Multiview Depth Estimation Neural Network[Z/OL]. arXiv preprint arXiv:1807.08563, 2018.
    [2] Yao Y, Luo Z, Li S, et al. MVSNet: Depth Inference for Unstructured Multi-View Stereo[Z/OL]. arXiv preprint arXiv:1804.02505, 2018.
    [3] Kim T, Choi J, Choi S, et al. Just a Few Points are All You Need for Multi-View Stereo: A Novel Semi-Supervised Learning Method for Multi-View Stereo[C]//2021 IEEE/CVF International Conference on Computer Vision (ICCV), 2021: 6158-6166.
    [4] Wang S, Leroy V, Cabon Y, et al. DUSt3R: Geometric 3D Vision Made Easy[C]//2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024: 20697-20709.
    [5] Wang H, Agapito L. 3D Reconstruction with Spatial Memory[Z/OL]. arXiv preprint arXiv:2408.16061, 2024.
    [6] Mildenhall B, Srinivasan PP, Tancik M, et al. NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis[J]. Communications of the ACM, 2021, 65(1): 99-106.
    [7] Kerbl B, Kopanas G, Leimkühler T, et al. 3D Gaussian Splatting for Real-Time Radiance Field Rendering[J]. ACM Transactions on Graphics, 2023, 42(4): 1-14.
    [8] Barron JT, Mildenhall B, Verbin D, et al. Mip-NeRF 360: Unbounded Anti-Aliased Neural Radiance Fields[C]//2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022: 5460-5469.
    [9] Knapitsch A, Park J, Zhou QY, et al. Tanks and Temples: Benchmarking Large-Scale Scene Reconstruction[J]. ACM Transactions on Graphics, 2017, 36(4): 1-13.
    [10] Hedman P, Philip J, Price T, et al. Deep Blending for Free-Viewpoint Image-Based Rendering[J]. ACM Transactions on Graphics, 2018, 37(6): 1-15.
    [11] Barron JT, Mildenhall B, Tancik M, et al. Mip-NeRF: A Multiscale Representation for Anti-Aliasing Neural Radiance Fields[C]//2021 IEEE/CVF International Conference on Computer Vision (ICCV), 2021: 5835-5844.
    [12] Müller T, Evans A, Schied C, et al. Instant Neural Graphics Primitives with a Multiresolution Hash Encoding[J]. ACM Transactions on Graphics, 2022, 41(4): 1-15.
    [13] Niedermayr S, Stumpfegger J, Westermann R. Compressed 3D Gaussian Splatting for Accelerated Novel View Synthesis[Z/OL]. arXiv preprint arXiv:2401.02436, 2024.
    [14] Lee JC, Rho D, Sun X, et al. Compact 3D Gaussian Representation for Radiance Field[Z/OL]. arXiv preprint arXiv:2311.13681, 2023.
    [15] Fan Z, Wang K, Wen K, et al. LightGaussian: Unbounded 3D Gaussian Compression with 15x Reduction and 200+ FPS[Z/OL]. arXiv preprint arXiv:2311.17245, 2023.
    [16] Lu T. Scaffold-GS: Structured 3D Gaussians for View-Adaptive Rendering[Z/OL]. arXiv preprint arXiv:2312.00109, 2023.
    [17] Chen Y, Wu Q, Lin W, et al. HAC: Hash-Grid Assisted Context for 3D Gaussian Splatting Compression[Z/OL]. arXiv preprint arXiv:2403.14530, 2024.
    [18] Morgenstern W, Barthel F, Hilsmann A, et al. Compact 3D Scene Representation via Self-Organizing Gaussian Grids[Z/OL]. arXiv preprint arXiv:2312.13299, 2023.
    [19] Gong Y. GDGS: Gradient Domain Gaussian Splatting for Sparse Representation of Radiance Fields[Z/OL]. arXiv preprint arXiv:2405.05446, 2024.
    [20] Xie H, Chen Z, Hong F, et al. GaussianCity: Generative Gaussian Splatting for Unbounded 3D City Generation[Z/OL]. arXiv preprint arXiv:2406.06526, 2024.
    [21] Yu Z. SGD: Street View Synthesis with Gaussian Splatting and Diffusion Prior[Z/OL]. arXiv preprint arXiv:2403.20079, 2024.
    [22] Gao Y, Ou J, Wang L, et al. Bootstrap 3D Reconstructed Scenes from 3D Gaussian Splatting[Z/OL]. arXiv preprint arXiv:2404.18669, 2024.
    [23] Wu K. HGS-Mapping: Online Dense Mapping Using Hybrid Gaussian Representation in Urban Scenes[Z/OL]. arXiv preprint arXiv:2403.20159, 2024.
    [24] Gao J. Relightable 3D Gaussians: Realistic Point Cloud Relighting with BRDF Decomposition and Ray Tracing[Z/OL]. arXiv preprint arXiv:2311.16043, 2023.
    [25] Blanc H, Deschaud JE, Paljic A. RayGauss: Volumetric Gaussian-Based Ray Casting for Photorealistic Novel View Synthesis[Z/OL]. arXiv preprint arXiv:2408.03356, 2024.
    [26] Moenne-Loccoz N. 3D Gaussian Ray Tracing: Fast Tracing of Particle Scenes[Z/OL]. arXiv preprint arXiv:2407.07090, 2024.
    [27] Jiang Y. GaussianShader: 3D Gaussian Splatting with Shading Functions for Reflective Surfaces[Z/OL]. arXiv preprint arXiv:2311.17977, 2023.
    [28] Meng J. Mirror-3DGS: Incorporating Mirror Reflections into 3D Gaussian Splatting[Z/OL]. arXiv preprint arXiv:2404.01168, 2024.
    [29] Lee B, Lee H, Sun X, et al. Deblurring 3D Gaussian Splatting[Z/OL]. arXiv preprint arXiv:2401.00834, 2024.
    [30] Lee J, Kim D, Lee D, et al. CRiM-GS: Continuous Rigid Motion-Aware Gaussian Splatting from Motion Blur Images[Z/OL]. arXiv preprint arXiv:2407.03923, 2024.
    [31] Guo Y, Hu L, Ma L, et al. SpikeGS: Reconstruct 3D Scene via Fast-Moving Bio-Inspired Sensors[Z/OL]. arXiv preprint arXiv:2407.03771, 2024.
    [32] Zhang J, Chen K, Chen S, et al. SpikeGS: 3D Gaussian Splatting from Spike Streams with High-Speed Camera Motion[Z/OL]. arXiv preprint arXiv:2407.10062, 2024.
    [33] Weng Y, Shen Z, Chen R, et al. EaDeblur-GS: Event-Assisted 3D Deblur Reconstruction with Gaussian Splatting[Z/OL]. arXiv preprint arXiv:2407.13520, 2024.
    [34] Chung J, Lee S, Nam H, et al. LucidDreamer: Domain-Free Generation of 3D Gaussian Splatting Scenes[Z/OL]. arXiv preprint arXiv:2311.13384, 2023.
    [35] Ouyang H, Heal K, Lombardi S, et al. Text2Immersion: Generative Immersive Scene with 3D Gaussians[Z/OL]. arXiv preprint arXiv:2312.09242, 2023.
    [36] Zhou S. DreamScene360: Unconstrained Text-to-3D Scene Generation with Panoramic Gaussian Splatting[Z/OL]. arXiv preprint arXiv:2404.06903, 2024.
    [37] Ma Y, Zhan D, Jin Z. FastScene: Text-Driven Fast 3D Indoor Scene Generation via Panoramic Gaussian Splatting[Z/OL]. arXiv preprint arXiv:2405.05768, 2024.
    [38] Zhou X. GALA3D: Towards Text-to-3D Complex Scene Generation via Layout-Guided Generative Gaussian Splatting[Z/OL]. arXiv preprint arXiv:2402.07207, 2024.
    [39] Rombach R, Blattmann A, Lorenz D, et al. High-Resolution Image Synthesis with Latent Diffusion Models[C]//2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022: 10674-10685.
    [40] Bhat SF, Birkl R, Wofk D, et al. ZoeDepth: Zero-Shot Transfer by Combining Relative and Metric Depth[Z/OL]. arXiv preprint arXiv:2302.12288, 2023.
    [41] Li D, Li J, Le H, et al. LAVIS: A Library for Language-Vision Intelligence[Z/OL]. arXiv preprint arXiv:2209.09019, 2022.
    [42] Zhang L, Rao A, Agrawala M. Adding Conditional Control to Text-to-Image Diffusion Models[C]//2023 IEEE/CVF International Conference on Computer Vision (ICCV), 2023: 3813-3824.
    [43] Lin XQ, He JW, Chen ZY, et al. DiffBIR: Towards Blind Image Restoration with Generative Diffusion Prior[Z/OL]. arXiv preprint arXiv:2308.15070, 2023.
    [44] Bar-Tal O, Yariv L, Lipman Y, et al. MultiDiffusion: Fusing Diffusion Paths for Controlled Image Generation[C]//Proceedings of Machine Learning Research, 2023: 1737-1752.
    [45] Yang Z, Wang J, Li L, et al. Idea2Img: Iterative Self-Refinement with GPT-4V(ision) for Automatic Image Design and Generation[Z/OL]. arXiv preprint arXiv:2310.08541, 2023.
    [46] Ranftl R, Bochkovskiy A, Koltun V. Vision Transformers for Dense Prediction[C]//2021 IEEE/CVF International Conference on Computer Vision (ICCV), 2021: 12179-12188.
    [47] Feng M, Liu J, Cui M, et al. Diffusion360: Seamless 360 Degree Panoramic Image Generation Based on Diffusion Models[Z/OL]. arXiv preprint arXiv:2311.13141, 2023.
    [48] Yun I, Shin C, Lee H, et al. EGformer: Equirectangular Geometry-Biased Transformer for 360 Depth Estimation[C]//2023 IEEE/CVF International Conference on Computer Vision (ICCV), 2023: 6078-6089.
    [49] Zeng Y, Fu J, Chao H, et al. Aggregated Contextual Transformations for High-Resolution Image Inpainting[J]. IEEE Transactions on Visualization and Computer Graphics, 2023, 29(7): 3266-3280.
    [50] Shi Y, Wang P, Ye J, et al. MVDream: Multi-View Diffusion for 3D Generation[Z/OL]. arXiv preprint arXiv:2308.16512, 2023.
    [51] Zoomers B, Wijnants M, Molenaers I, et al. PRoGS: Progressive Rendering of Gaussian Splats[Z/OL]. arXiv preprint arXiv:2409.01761, 2024.
History
  • Received: November 27, 2024
  • Revised: March 13, 2025
  • Accepted: March 14, 2025
  • Online: March 18, 2025