OptiDepthNet: A Real-Time Unsupervised Monocular Depth Estimation Network

References

  1. Liu, F., Shen, C., Lin, G., et al. (2015). Learning depth from single monocular images using deep convolutional neural fields. IEEE Transactions on Pattern Analysis & Machine Intelligence, 38(10), 2024–2039.
  2. Qingbo, Z., & Hongyuan, W. (2010). Block recovery stereo matching algorithm using image segmentation. Journal of Huazhong University of Science and Technology, 38(1), 81–84.
  3. Zexiao, X., & Zuoqi, Z. (2018). Spatial point localization method based on structure from motion. Laser & Optoelectronics Progress, 55(8), 370–377.
  4. Cheng, X., Xiaohan, T., Siping, L., et al. (2019). Fast monocular depth estimation method for embedded platforms (in Chinese). Chinese Patent CN110599533A.
  5. Eigen, D., Puhrsch, C., & Fergus, R. (2014). Depth map prediction from a single image using a multi-scale deep network. In Advances in Neural Information Processing Systems (NIPS), pp. 2366–2374.
  6. Eigen, D., & Fergus, R. (2015). Predicting depth, surface normals and semantic labels with a common multi-scale convolutional architecture. In IEEE International Conference on Computer Vision (ICCV).
  7. Liu, F., Shen, C., & Lin, G. (2015). Deep convolutional neural fields for depth estimation from a single image. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 5162–5170.
  8. Li, B., Shen, C., Dai, Y., van den Hengel, A., & He, M. (2015). Depth and surface normal estimation from monocular images using regression on deep features and hierarchical CRFs. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, pp. 1119–1127. https://doi.org/10.1109/CVPR.2015.7298715
  9. Laina, I., Rupprecht, C., Belagiannis, V., Tombari, F., & Navab, N. (2016). Deeper depth prediction with fully convolutional residual networks. In Fourth International Conference on 3D Vision (3DV), pp. 239–248.
  10. Cao, Y., Wu, Z., & Shen, C. (2017). Estimating depth from monocular images as classification using deep fully convolutional residual networks. IEEE Transactions on Circuits and Systems for Video Technology.
  11. Garg, R., Vijay Kumar, B. G., Carneiro, G., & Reid, I. (2016). Unsupervised CNN for single view depth estimation: Geometry to the rescue. In 14th European Conference on Computer Vision (ECCV). Amsterdam: Springer, pp. 740–756.
  12. Godard, C., Mac Aodha, O., & Brostow, G. J. (2017). Unsupervised monocular depth estimation with left-right consistency. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
  13. Godard, C., Mac Aodha, O., Firman, M., et al. (2019). Digging into self-supervised monocular depth estimation. In IEEE International Conference on Computer Vision (ICCV).
  14. Tosi, F., Aleotti, F., Poggi, M., & Mattoccia, S. (2019). Learning monocular depth estimation infusing traditional stereo knowledge. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 9799–9809. https://doi.org/10.1109/CVPR.2019.01003
  15. Casser, V., Pirk, S., Mahjourian, R., et al. (2019). Depth prediction without the sensors: leveraging structure for unsupervised learning from monocular videos. In AAAI.
  16. Wang, L., Zhang, J., Wang, Y., et al. (2020). CLIFFNet for monocular depth estimation with hierarchical embedding loss. In European Conference on Computer Vision (ECCV). Cham: Springer.
  17. Mancini, M., Costante, G., Valigi, P., et al. (2016). Fast robust monocular depth estimation for obstacle detection with fully convolutional networks. In IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). https://doi.org/10.1109/IROS.2016.7759632
  18. Atapour-Abarghouei, A., & Breckon, T. P. (2018). Real-time monocular depth estimation using synthetic data with domain adaptation. In IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
  19. Liu, W., Anguelov, D., Erhan, D., et al. (2016). SSD: Single shot multibox detector. In European Conference on Computer Vision (ECCV).
  20. Redmon, J., & Farhadi, A. (2018). YOLOv3: An incremental improvement. arXiv preprint arXiv:1804.02767.
  21. Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). ImageNet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems (NIPS), pp. 1097–1105.
  22. Lecun, Y., Denker, J. S., Solla, S. A., Howard, R. E., & Jackel, L. D. (1989). Optimal brain damage. In Advances in Neural Information Processing Systems 2, NIPS Conference, Denver, Colorado, USA, November 27–30, 1989.
  23. He, K., Zhang, X., Ren, S., et al. (2016). Deep residual learning for image recognition. In 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
  24. Girshick, R. (2015). Fast R-CNN. In IEEE International Conference on Computer Vision (ICCV).
  25. Mancini, M., Costante, G., Valigi, P., & Ciarfuglia, T. A. (2016). Fast robust monocular depth estimation for obstacle detection with fully convolutional networks. In IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 4296–4303.
  26. Howard, A. G., Zhu, M., Chen, B., Kalenichenko, D., Wang, W., Weyand, T., et al. (2017). MobileNets: Efficient convolutional neural networks for mobile vision applications. arXiv preprint https://arxiv.org/pdf/1704.04861.pdf.
  27. Liu, J., Li, Q., Cao, R., et al. (2020). MiniNet: An extremely lightweight convolutional neural network for real-time unsupervised monocular depth estimation. ISPRS Journal of Photogrammetry and Remote Sensing, 166, 255–267.
  28. Ronneberger, O., Fischer, P., & Brox, T. (2015). U-Net: Convolutional networks for biomedical image segmentation. In Medical Image Computing and Computer-Assisted Intervention (MICCAI). Springer International Publishing.
  29. Chen, L. C., Zhu, Y., Papandreou, G., et al. (2018). Encoder-decoder with atrous separable convolution for semantic image segmentation. In European Conference on Computer Vision. Springer, Cham.
  30. Enkun, C., Yanqing, T., & Jiawei, L. (2020). Calibration error compensation for the stereo measurement system. Applied Optics, 242(06), 46–52.
  31. Long, J., Shelhamer, E., & Darrell, T. (2017). Fully convolutional networks for semantic segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 39(4), 640–651.
  32. Wang, Z., Bovik, A. C., Sheikh, H. R., & Simoncelli, E. P. (2004). Image quality assessment: From error visibility to structural similarity. IEEE Transactions on Image Processing, 13(4), 600–612. https://doi.org/10.1109/TIP.2003.819861
  33. Heise, P., Klose, S., Jensen, B., & Knoll, A. (2013). PM-Huber: PatchMatch with Huber regularization for stereo matching. In IEEE International Conference on Computer Vision (ICCV), Sydney, Australia, pp. 2360–2367.
  34. Ranjan, A., Jampani, V., Balles, L., Kim, K., Sun, D., Wulff, J., & Black, M. J. (2019). Competitive collaboration: Joint unsupervised learning of depth, camera motion, optical flow and motion segmentation. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 12240–12249. https://doi.org/10.1109/CVPR.2019.01252
  35. Wofk, D., Ma, F., Yang, T.-J., Karaman, S., & Sze, V. (2019). FastDepth: Fast monocular depth estimation on embedded systems. In International Conference on Robotics and Automation (ICRA).
  36. Kuznietsov, Y., Stuckler, J., & Leibe, B. (2017). Semi-supervised deep learning for monocular depth map prediction. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 6647–6655. https://doi.org/10.1109/CVPR.2017.238
  37. Zhou, T., Brown, M., Snavely, N., & Lowe, D. G. (2017). Unsupervised learning of depth and ego-motion from video. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 6612–6619. https://doi.org/10.1109/CVPR.2017.700
  38. Yin, Z., & Shi, J. (2018). GeoNet: Unsupervised learning of dense depth, optical flow and camera pose. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1983–1992. https://doi.org/10.1109/CVPR.2018.00212
