Institute of Semiconductors, Chinese Academy of Sciences

Transformer Based Binocular Disparity Prediction with Occlusion Predict and Novel Full Connection Layers

2022-11-03

Author(s): Liu, Y (Liu, Yi); Xu, XT (Xu, Xintao); Xiang, BJ (Xiang, Bajian); Chen, G (Chen, Gang); Gong, GL (Gong, Guoliang); Lu, HX (Lu, Huaxiang)

Source: SENSORS Volume: 22 Issue: 19 Article Number: 7577 DOI: 10.3390/s22197577 Published: OCT 2022

Abstract: Depth estimation algorithms based on convolutional neural networks that compute disparity by constructing a matching cost volume have several limitations. Because the disparity range is fixed in advance, true disparities beyond that range cannot be recovered. The matching process also lacks constraints on occlusion and matching uniqueness. Finally, as a local feature extractor, a convolutional neural network has limited ability to perceive global context. To address these problems, we propose a Transformer-based disparity prediction algorithm comprising a Swin-SPP module for feature extraction based on Swin Transformer, a Transformer disparity matching network based on self-attention and cross-attention mechanisms, and an occlusion prediction sub-network. In addition, we propose a fully connected layer with double skip connections to address vanishing and exploding gradients when training the Transformer model, which further improves inference accuracy. The proposed model achieves an EPE (absolute error) of 0.57 and 0.61 and a 3PE (percentage of errors greater than 3 px) of 1.74% and 1.56% on the KITTI 2012 and KITTI 2015 datasets, respectively, with an inference time of 0.46 s and only 2.6 M parameters, showing clear advantages over other algorithms across the evaluation metrics.
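The two metrics quoted in the abstract can be illustrated with a minimal sketch. It follows the abstract's wording only: EPE is taken as the mean absolute disparity error and 3PE as the percentage of pixels whose error exceeds 3 px. The function name `disparity_metrics`, the sample values, and the convention that a ground-truth value of 0 or less marks a pixel without ground truth are assumptions made for illustration; this is not the authors' evaluation code.

```python
def disparity_metrics(pred, gt, threshold=3.0):
    """Return (EPE, percentage of valid pixels with error > threshold).

    pred, gt: flat sequences of disparities in pixels.
    Assumption: gt <= 0 marks a pixel without ground truth and is skipped.
    """
    # Absolute disparity error for every pixel that has ground truth.
    errors = [abs(p - g) for p, g in zip(pred, gt) if g > 0]
    if not errors:
        raise ValueError("no valid ground-truth pixels")
    epe = sum(errors) / len(errors)
    # Count pixels whose error is strictly greater than the threshold.
    bad = sum(1 for e in errors if e > threshold)
    return epe, 100.0 * bad / len(errors)


# Hypothetical sample values, chosen only to exercise the function.
pred = [10.0, 20.5, 33.0, 41.0, 7.0]
gt = [10.5, 20.0, 29.0, 40.0, 0.0]  # last pixel has no ground truth
epe, three_pe = disparity_metrics(pred, gt)
print(f"EPE = {epe:.2f} px, 3PE = {three_pe:.1f}%")
```

In this toy input, one of the four valid pixels is off by 4 px, so it raises the 3PE while the other three small errors mainly affect the EPE.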

Accession Number: WOS:000867173300001

PubMed ID: 36236675

Author Identifiers:

Author: 刘, 毅 (Liu, Yi); Web of Science ResearcherID: GXN-1792-2022; ORCID Number: 0000-0003-3056-7713

eISSN: 1424-8220

Full Text: https://www.mdpi.com/1424-8220/22/19/7577

Copyright: Institute of Semiconductors, Chinese Academy of Sciences

ICP registration no.: 京ICP备05085259-1号; public security registration no.: 京公网安备110402500052
