SONG Lizheng, LIN Dongyun, PENG Xiafu, LIU Tengfei. Patch-match binocular 3D reconstruction based on deep learning[J]. Journal of Applied Optics, 2022, 43(3): 436-443. DOI: 10.5768/JAO202243.0302003

Patch-match binocular 3D reconstruction based on deep learning

Abstract: Algorithms built around patch-match are widely used in binocular stereo reconstruction because of their low memory consumption and high reconstruction accuracy. However, the traditional patch-match algorithm must visit every pixel of the image in order and iteratively search for its optimal disparity d, which results in a long running time. To address this problem, a learning-based model is introduced to guide the traditional patch-match algorithm, reducing the running time and improving stereo reconstruction accuracy. First, a deep learning model outputs an initial disparity map together with a per-pixel heteroscedastic uncertainty, which measures how accurate the network's predicted disparity is. Then, the heteroscedastic uncertainty and the initial disparity are used as prior information for the patch-match algorithm. Finally, in the plane refinement step, the heteroscedastic uncertainty of each pixel is used to dynamically adjust that pixel's search interval, which reduces the running time. On the Middlebury dataset, the improved algorithm reduces the running time by 20% compared with the original algorithm, while slightly improving reconstruction accuracy in regions such as discontinuities.
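
The uncertainty-guided refinement described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the mapping from uncertainty to search interval, so everything below is an assumption made for illustration: the function names, the scale factor `k`, the lower bound `floor`, the stopping threshold `stop`, and the rule of halving the interval after each random candidate. The sketch also refines only a scalar disparity rather than a full slanted plane, and `cost` is a hypothetical stand-in for the matching cost.

```python
import numpy as np


def initial_search_range(sigma, max_disp, k=3.0, floor=0.1):
    """Map per-pixel uncertainty to an initial search half-width.

    Assumption: sigma is a standard deviation in disparity units and the
    half-width is k * sigma, clipped to [floor, max_disp / 2].
    """
    sigma = np.asarray(sigma, dtype=np.float64)
    return np.clip(k * sigma, floor, max_disp / 2.0)


def refinement_steps(delta0, stop=0.1):
    """Count the candidates tried when the half-width is halved each step."""
    steps = 0
    delta = float(delta0)
    while delta >= stop:
        steps += 1
        delta /= 2.0
    return steps


def refine_disparity(d_init, delta0, cost, max_disp, rng, stop=0.1):
    """Randomized refinement of one disparity value inside a shrinking interval.

    A candidate replaces the current value only if it lowers the cost, so the
    result is never worse than the network's initial prediction d_init.
    """
    best_d = float(d_init)
    best_c = cost(best_d)
    delta = float(delta0)
    while delta >= stop:
        cand = float(np.clip(best_d + rng.uniform(-delta, delta), 0.0, max_disp))
        c = cost(cand)
        if c < best_c:
            best_d, best_c = cand, c
        delta /= 2.0
    return best_d


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    max_disp = 64.0
    sigma = np.array([0.0, 1.0, 100.0])  # confident, moderate, unreliable pixel
    ranges = initial_search_range(sigma, max_disp)
    for s, r in zip(sigma, ranges):
        print(f"sigma={s:6.1f}  half-width={r:5.1f}  steps={refinement_steps(r)}")

    # Hypothetical quadratic cost with its minimum at disparity 10.
    refined = refine_disparity(9.0, ranges[1], lambda d: (d - 10.0) ** 2, max_disp, rng)
    print("refined disparity:", refined)
```

Under these assumptions, a pixel whose predicted disparity has low uncertainty starts from a narrow interval and evaluates fewer candidates, while an unreliable pixel falls back to the full `max_disp / 2` range. This is one plausible way the per-pixel search effort could shrink; the paper's actual rule may differ.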

     
