Mid Sweden University

A Lightweight Neural Network for Monocular View Generation with Occlusion Handling
2021 (English)
In: IEEE Transactions on Pattern Analysis and Machine Intelligence, ISSN 0162-8828, E-ISSN 1939-3539, Vol. 43, no 6, p. 1832-1844, article id 8936903
Article in journal (Refereed) Published
Abstract [en]

In this article, we present a very lightweight neural network architecture, trained on stereo data pairs, which performs view synthesis from a single image. With the growing success of multi-view formats, this problem is increasingly relevant. The network returns a prediction built from disparity estimation and fills in wrongly predicted regions using an occlusion handling technique. To do so, during training, the network learns to estimate the left-right consistency structural constraint on the pair of stereo input images, so that it can replicate it at test time from a single image. The method is built upon the idea of blending two predictions: a prediction based on disparity estimation and a prediction based on direct minimization in occluded regions. The network is also able to identify these occluded areas at training and at test time by checking the pixelwise left-right consistency of the produced disparity maps. At test time, the approach can thus generate a left-side and a right-side view from one input image, as well as a depth map and a pixelwise confidence measure in the prediction. The method outperforms state-of-the-art approaches, both visually and metric-wise, on the challenging KITTI dataset, all while reducing the required number of parameters (6.5 M) by a significant factor (5 or 10 times). © 1979-2012 IEEE.
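The pixelwise left-right consistency check and the blending of two predictions described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `left_right_consistency_mask` and `blend_predictions`, the one-pixel `threshold`, the disparity sign convention (a left-image pixel at column x matches column x - d in the right image), and the hard binary blend are all assumptions made for illustration. The record describes a learned network with a pixelwise confidence measure, which this sketch does not reproduce.

```python
import numpy as np


def left_right_consistency_mask(disp_left, disp_right, threshold=1.0):
    """Return a boolean mask, True where the left disparity agrees with the right.

    Assumed convention (not from the record): a left-image pixel at column x
    corresponds to column x - disp_left[y, x] in the right image.
    """
    h, w = disp_left.shape
    ys, xs = np.indices((h, w))
    # Column in the right image that each left-image pixel points to.
    x_right = np.rint(xs - disp_left).astype(int)
    inside = (x_right >= 0) & (x_right < w)
    # Clip only so that the lookup is valid; out-of-frame pixels are
    # rejected through the `inside` mask below.
    x_clipped = np.clip(x_right, 0, w - 1)
    disp_right_sampled = disp_right[ys, x_clipped]
    error = np.abs(disp_left - disp_right_sampled)
    # Pixels that fall outside the frame or disagree are treated as occluded.
    return inside & (error <= threshold)


def blend_predictions(warped, direct, consistent):
    """Use the disparity-based view where consistent, the direct one elsewhere.

    `warped` and `direct` are H x W x C images; `consistent` is an H x W mask.
    A hard 0/1 weight is an illustrative simplification.
    """
    weight = consistent.astype(float)[..., None]
    return weight * warped + (1.0 - weight) * direct
```

With a constant disparity of 2 pixels, for example, the two leftmost columns map outside the right image and are flagged as inconsistent, so the blended output takes its values there from the direct prediction.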

Place, publisher, year, edition, pages
IEEE Computer Society, 2021. Vol. 43, no 6, p. 1832-1844, article id 8936903
Keywords [en]
Computer vision, deep learning, monocular, stereo, view synthesis, Forecasting, Network architecture, Confidence Measure, Direct minimization, Disparity estimations, Monocular view, Occlusion handling, Prediction-based, State-of-the-art approach, Structural constraints, Neural networks
Identifiers
URN: urn:nbn:se:miun:diva-43447
DOI: 10.1109/TPAMI.2019.2960689
ISI: 000649590200002
Scopus ID: 2-s2.0-85105887287
OAI: oai:DiVA.org:miun-43447
DiVA, id: diva2:1603733
Available from: 2021-10-18 Created: 2021-10-18 Last updated: 2021-10-18 Bibliographically approved

Search in DiVA

By author/editor
Guillemot, C.
In the same journal
IEEE Transactions on Pattern Analysis and Machine Intelligence
