Gaining Depth: Time-of-Flight Sensor Fusion for Three-Dimensional Video Content Creation
Mid Sweden University, Faculty of Science, Technology and Media, Department of Information and Communication systems (Realistic3D). ORCID iD: 0000-0002-2578-7896
2014 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

The successful revival of three-dimensional (3D) cinema has generated a great deal of interest in 3D video. However, contemporary eyewear-assisted display technologies are not well suited for the less restricted scenarios outside movie theaters. The next generation of 3D displays, autostereoscopic multiview displays, overcome the restrictions of traditional stereoscopic 3D and can provide an important boost for 3D television (3DTV). At the same time, such displays require scene depth information in order to reduce the amount of necessary input data. Acquiring this information is complex and challenging, restricting content creators and limiting the amount of available 3D video content. Nonetheless, without broad and innovative 3D television programs, even next-generation 3DTV will lack customer appeal. Therefore, simplified 3D video content generation is essential for the medium's success.

This dissertation surveys the advantages and limitations of contemporary 3D video acquisition. Based on these findings, a combination of dedicated depth sensors, so-called Time-of-Flight (ToF) cameras, and video cameras is investigated with the aim of simplifying 3D video content generation. The concept of Time-of-Flight sensor fusion is analyzed in order to identify suitable courses of action for high quality 3D video acquisition. In order to overcome the main drawbacks of current Time-of-Flight technology, namely high sensor noise and low spatial resolution, a weighted optimization approach for Time-of-Flight super-resolution is proposed. This approach incorporates video texture, measurement noise and temporal information for high quality 3D video acquisition from a single video plus Time-of-Flight camera combination. Objective evaluations show benefits with respect to state-of-the-art depth upsampling solutions. Subjective visual quality assessment confirms the objective results, with a significant increase in viewer preference by a factor of four. Furthermore, the presented super-resolution approach can be applied to other applications, such as depth video compression, providing bit rate savings of approximately 10 percent compared to competing depth upsampling solutions. The work presented in this dissertation has been published in two scientific journals and five peer-reviewed conference proceedings.

In conclusion, Time-of-Flight sensor fusion can help to simplify 3D video content generation, consequently supporting a larger variety of available content. Thus, this dissertation provides important inputs towards broad and innovative 3D video content, hopefully contributing to the future success of next-generation 3DTV.

Place, publisher, year, edition, pages
Sundsvall: Mittuniversitetet, 2014. 228 p.
Series
Mid Sweden University doctoral thesis, ISSN 1652-893X ; 185
Keyword [en]
3D video, Time-of-Flight, depth map acquisition, optimization, 3DTV, ToF, upsampling, super-resolution, sensor fusion
National Category
Computer Systems
Identifiers
URN: urn:nbn:se:miun:diva-21938
Local ID: STC
ISBN: 978-91-87557-49-1 (print)
OAI: oai:DiVA.org:miun-21938
DiVA: diva2:717351
Public defence
2014-06-04, L111, Holmgatan 10, Sundsvall, 10:00 (English)
Opponent
Supervisors
Funder
Knowledge Foundation, 2009/0264
Available from: 2014-05-16 Created: 2014-05-14 Last updated: 2017-03-06. Bibliographically approved
List of papers
1. Depth Sensing for 3DTV: A Survey
2013 (English). In: IEEE Multimedia, ISSN 1070-986X, E-ISSN 1941-0166, Vol. 20, no. 4, pp. 10-17. Article in journal (Refereed). Published
Abstract [en]

In the context of 3D video systems, depth information could be used to render a scene from additional viewpoints. Although there have been many recent advances in this area, including the introduction of the Microsoft Kinect sensor, the robust acquisition of such information continues to be a challenge. This article reviews three depth-sensing approaches for 3DTV. The authors discuss several approaches for acquiring depth information and provide a comparative analysis of their characteristics.

Place, publisher, year, edition, pages
IEEE Computer Society, 2013
Keyword
3D video, scene acquisition, capture, depth sensing, stereo analysis, structured lighting, time-of-flight, sensor fusion
National Category
Signal Processing
Identifiers
urn:nbn:se:miun:diva-20416 (URN)
10.1109/MMUL.2013.53 (DOI)
000327723900007 ()
2-s2.0-84890069117 (Scopus ID)
STC (Local ID)
STC (Archive number)
STC (OAI)
Projects
Realistic3D
Funder
Knowledge Foundation, 2009/0264
Note

This work has been supported by grant 2009/0264 of the KK Foundation, Sweden; grant 00156702 of the EU European Regional Development Fund, Mellersta Norrland, Sweden; and grant 00155148 of Länsstyrelsen Västernorrland, Sweden.

Available from: 2013-12-03 Created: 2013-12-03 Last updated: 2016-10-20. Bibliographically approved
2. A Weighted Optimization Approach to Time-of-Flight Sensor Fusion
2014 (English). In: IEEE Transactions on Image Processing, ISSN 1057-7149, E-ISSN 1941-0042, Vol. 23, no. 1, pp. 214-225. Article in journal (Refereed). Published
Abstract [en]

Acquiring scenery depth is a fundamental task in computer vision, with many applications in manufacturing, surveillance, or robotics relying on accurate scenery information. Time-of-flight cameras can provide depth information in real time and overcome shortcomings of traditional stereo analysis. However, they provide limited spatial resolution, and sophisticated upscaling algorithms are sought after. In this paper, we present a sensor fusion approach to time-of-flight super resolution, based on the combination of depth and texture sources. Unlike other texture-guided approaches, we interpret the depth upscaling process as a weighted energy optimization problem. Three different weights are introduced, employing different available sensor data. The individual weights address object boundaries in depth, depth sensor noise, and temporal consistency. Applied in consecutive order, they form three weighting strategies for time-of-flight super resolution. Objective evaluations show advantages in depth accuracy and for depth image based rendering compared with state-of-the-art depth upscaling. Subjective view synthesis evaluation shows a significant increase in viewer preference, by a factor of four, in stereoscopic viewing conditions. To the best of our knowledge, this is the first extensive subjective test performed on time-of-flight depth upscaling. Objective and subjective results prove the suitability of our time-of-flight super-resolution approach for depth scenery capture.
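The idea of depth upscaling as a weighted energy minimization can be illustrated with a small sketch. The following one-dimensional toy example is not the paper's actual algorithm; the function name `upscale_depth`, the weight definition and all parameter values are assumptions made purely for illustration. Sparse depth samples provide a data term, while texture-derived weights relax the smoothness constraint across texture edges, so depth discontinuities are allowed exactly where the texture has edges.

```python
# Toy 1-D sketch of depth upscaling as weighted energy minimization.
# Hypothetical weight choice: smoothness weights shrink across strong
# texture edges, aligning depth discontinuities with texture edges.

def upscale_depth(low_depth, texture, factor, lam=5.0, iters=3000, lr=0.02):
    n = len(texture)
    # Nearest-neighbour initialization of the high-resolution depth.
    d = [low_depth[min(i // factor, len(low_depth) - 1)] for i in range(n)]
    # Edge-aware smoothness weights derived from the texture signal.
    w = [1.0 / (1.0 + abs(texture[i + 1] - texture[i])) for i in range(n - 1)]
    for _ in range(iters):  # plain gradient descent on the energy
        grad = [0.0] * n
        for j, z in enumerate(low_depth):        # data term: stay close
            i = j * factor                       # to the sparse ToF samples
            grad[i] += 2.0 * (d[i] - z)
        for i in range(n - 1):                   # weighted smoothness term
            g = 2.0 * lam * w[i] * (d[i] - d[i + 1])
            grad[i] += g
            grad[i + 1] -= g
        d = [d[i] - lr * grad[i] for i in range(n)]
    return d

# Texture has a sharp edge between pixels 3 and 4; depth is sampled
# only at pixels 0 and 4 (upscaling factor 4). The upscaled depth
# keeps a sharp jump at the texture edge instead of blurring it.
texture = [10, 10, 10, 10, 200, 200, 200, 200]
hi = upscale_depth([1.0, 9.0], texture, factor=4)
```

In the paper's formulation additional weights handle sensor noise and temporal consistency; this sketch only shows the edge-weighted smoothness idea.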

Place, publisher, year, edition, pages
IEEE Signal Processing Society, 2014
Keyword
Sensor fusion, range data, time-of-flight sensors, depth map upscaling, 3D video, stereo vision
National Category
Signal Processing
Identifiers
urn:nbn:se:miun:diva-20415 (URN)
10.1109/TIP.2013.2287613 (DOI)
000329195500017 ()
2-s2.0-84888373138 (Scopus ID)
STC (Local ID)
STC (Archive number)
STC (OAI)
Projects
Realistic3D
Funder
Knowledge Foundation, 2009/0264
Note

This work was supported in part by the KK Foundation of Sweden under Grant 2009/0264, in part by the EU European Regional Development Fund, Mellersta Norrland, Sweden, under Grant 00156702, and in part by Länsstyrelsen Västernorrland, Sweden, under Grant 00155148.

Available from: 2013-12-03 Created: 2013-12-03 Last updated: 2017-03-06. Bibliographically approved
3. Depth Map Upscaling Through Edge Weighted Optimization
2012 (English). In: Proceedings of SPIE - The International Society for Optical Engineering / [ed] Atilla M. Baskurt, Robert Sitnik, SPIE - International Society for Optical Engineering, 2012, Art. no. 829008. Conference paper (Refereed)
Abstract [en]

Accurate depth maps are a prerequisite in three-dimensional television, e.g. for high quality view synthesis, but this information is not always easily obtained. Depth information gained by correspondence matching from two or more views suffers from disocclusions and low-textured regions, leading to erroneous depth maps. These errors can be avoided by using depth from dedicated range sensors, e.g. time-of-flight sensors. Because these sensors only have restricted resolution, the resulting depth data need to be adjusted to the resolution of the appropriate texture frame. Standard upscaling methods provide only limited quality results. This paper proposes a solution for upscaling low resolution depth data to match high resolution texture data. We introduce the Edge Weighted Optimization Concept (EWOC) for fusing low resolution depth maps with corresponding high resolution video frames by solving an overdetermined linear equation system. Similar to other approaches, we take information from the high resolution texture, but additionally validate this information with the low resolution depth to accentuate correlated data. Objective tests show an improvement in depth map quality in comparison to other upscaling approaches. This improvement is subjectively confirmed in the resulting view synthesis.

Place, publisher, year, edition, pages
SPIE - International Society for Optical Engineering, 2012
Keyword
3DTV, depth map, upscaling, time-of-flight, view synthesis, optimization, edge detection
National Category
Electrical Engineering, Electronic Engineering, Information Engineering
Identifiers
urn:nbn:se:miun:diva-15805 (URN)
10.1117/12.903921 (DOI)
000304302300007 ()
2-s2.0-84861935064 (Scopus ID)
STC (Local ID)
978-081948937-1 (ISBN)
STC (Archive number)
STC (OAI)
Conference
3-Dimensional Image Processing (3DIP) and Applications II; Burlingame, CA; 24 January 2012 through 26 January 2012; Code 90039
Available from: 2012-02-16 Created: 2012-01-31 Last updated: 2016-10-20. Bibliographically approved
4. Improved edge detection for EWOC depth upscaling
2012 (English). In: 2012 19th International Conference on Systems, Signals and Image Processing, IWSSIP 2012, IEEE conference proceedings, 2012, pp. 1-4. Conference paper (Refereed)
Abstract [en]

The need for accurate depth information in three-dimensional television (3DTV) encourages the use of range sensors, i.e. time-of-flight (ToF) cameras. Since these sensors provide only limited spatial resolution compared to modern high resolution image sensors, upscaling methods are much needed. Typical depth upscaling algorithms fuse low resolution depth information with appropriate high resolution texture frames, taking advantage of the additional texture information in the upscaling process. We recently introduced a promising upscaling method, utilizing edge information from the texture frame to upscale low resolution depth maps. This paper examines how a more thorough edge detection can be achieved by investigating different edge detection sources, such as intensity, color spaces and difference signals. Our findings show that a combination of sources based on the perceptual qualities of the human visual system (HVS) leads to slightly improved results. On the other hand, these improvements imply a more complex edge detection.

Place, publisher, year, edition, pages
IEEE conference proceedings, 2012
Series
Systems, Signals and Image Processing (IWSSIP), ISSN 2157-8672 ; 19
Keyword
3DTV, EWOC, depth map, ToF, upscaling, perceptual edge detection, HVS, CIE2000
National Category
Signal Processing
Identifiers
urn:nbn:se:miun:diva-16210 (URN)
2-s2.0-84863949324 (Scopus ID)
STC (Local ID)
978-1-4577-2191-5 (ISBN)
STC (Archive number)
STC (OAI)
Conference
2012 19th International Conference on Systems, Signals and Image Processing, IWSSIP 2012; Vienna; 11 April 2012 through 13 April 2012; Category number CFP1255E-ART; Code 91138
Available from: 2012-09-14 Created: 2012-05-14 Last updated: 2016-10-20. Bibliographically approved
5. Adaptive depth filtering for HEVC 3D video coding
2012 (English). In: 2012 Picture Coding Symposium, PCS 2012, Proceedings, IEEE conference proceedings, 2012, pp. 49-52. Conference paper (Refereed)
Abstract [en]

Consumer interest in 3D television (3DTV) is growing steadily, but currently available 3D displays still need additional eyewear and suffer from the limitation of a single stereo view pair. It can therefore be assumed that autostereoscopic multiview displays are the next step in 3D-at-home entertainment, since these displays can utilize the Multiview Video plus Depth (MVD) format to synthesize numerous viewing angles from only a small set of given input views. This motivates efficient MVD compression as an important keystone for the commercial success of 3DTV. In this paper we concentrate on the compression of depth information in an MVD scenario. There have been several publications suggesting depth down- and upsampling to increase coding efficiency. We follow this path, using our recently introduced Edge Weighted Optimization Concept (EWOC) for depth upscaling. EWOC uses edge information from the video frame in the upscaling process and allows the use of sparse, non-uniformly distributed depth values. We exploit this fact to expand the depth down-/upsampling idea with an adaptive low-pass filter, reducing high-energy parts in the original depth map prior to subsampling and compression. Objective results show the viability of our approach for depth map compression with up-to-date High Efficiency Video Coding (HEVC). For the same Y-PSNR in synthesized views, we achieve up to 18.5% bit rate decrease compared to full-scale depth and around 10% compared to competing depth down-/upsampling solutions. These results were confirmed by a subjective quality assessment, showing a statistically significant preference in 87.5% of the test cases.
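The down-/upsampling pipeline around the codec can be sketched as follows. This is a minimal stand-in, not the paper's method: a fixed 3-tap low-pass filter (with a hypothetical `strength` parameter in place of real adaptivity) suppresses high-energy parts before subsampling, and plain nearest-neighbour upscaling stands in for EWOC on the decoder side. All function names and values are illustrative assumptions.

```python
# Sketch of depth down-/upsampling for coding. The fixed low-pass
# filter is a stand-in for the paper's adaptive filter; a real
# implementation would vary `strength` with local depth activity.

def low_pass(depth, strength=0.25):
    # Simple 3-tap smoothing that attenuates high-frequency depth
    # content before subsampling (border samples are left untouched).
    out = depth[:]
    for i in range(1, len(depth) - 1):
        out[i] = (1 - 2 * strength) * depth[i] + strength * (depth[i - 1] + depth[i + 1])
    return out

def subsample(depth, factor):
    # Keep every `factor`-th sample; this is what would be encoded.
    return depth[::factor]

def upsample_nn(depth, factor, n):
    # Decoder-side upscaling; EWOC would use texture edges here instead.
    return [depth[min(i // factor, len(depth) - 1)] for i in range(n)]

depth = [1, 1, 8, 1, 1, 1, 9, 9]
coded = subsample(low_pass(depth), 2)        # half the samples to encode
recon = upsample_nn(coded, 2, len(depth))    # reconstructed full-scale depth
```

Filtering before subsampling reduces the high-frequency energy the codec must spend bits on, which is the intuition behind the reported rate savings.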

Place, publisher, year, edition, pages
IEEE conference proceedings, 2012
Keyword
3-D displays; 3-D television; 3D video coding; Auto stereoscopic; Bit rates; Coding efficiency; Consumer interests; Depth information; Depth Map; Depth value; Edge information; High energy; Multiview displays; Multiview video; Stereo view; Subjective quality assessments; Test case; Upsampling; Upscaling; Video frame; Viewing angle
National Category
Signal Processing
Identifiers
urn:nbn:se:miun:diva-16211 (URN)
10.1109/PCS.2012.6213283 (DOI)
000306962400013 ()
2-s2.0-84864026988 (Scopus ID)
STC (Local ID)
978-1-4577-2048-2 (ISBN)
STC (Archive number)
STC (OAI)
Conference
29th Picture Coding Symposium, PCS 2012; Krakow; 7 May 2012 through 9 May 2012; Code 91163
Projects
Realistic3D
Available from: 2012-09-14 Created: 2012-05-14 Last updated: 2016-10-20. Bibliographically approved
6. Incremental depth upscaling using an edge weighted optimization concept
2012 (English). In: 3DTV-Conference, 2012, Art. no. 6365429. Conference paper (Refereed)
Abstract [en]

Precise scene depth information is a prerequisite in three-dimensional television (3DTV), e.g. for high quality view synthesis in autostereoscopic multiview displays. Unfortunately, this information is not easily obtained and often of limited quality. Dedicated range sensors, such as time-of-flight (ToF) cameras, can deliver reliable depth information where (stereo-)matching fails. Nonetheless, since these sensors provide only restricted spatial resolution, sophisticated upscaling methods are sought-after, to match depth information to corresponding texture frames. Where traditional upscaling fails, novel approaches have been proposed, utilizing additional information from the texture for the depth upscaling process. We recently proposed the Edge Weighted Optimization Concept (EWOC) for ToF upscaling, using texture edges for accurate depth boundaries. In this paper we propose an important update to EWOC, dividing it into smaller incremental upscaling steps. We predict two major improvements from this. Firstly, processing time should be decreased by dividing one big calculation into several smaller steps. Secondly, we assume an increase in quality for the upscaled depth map, due to a more coherent edge detection on the video frame. In our evaluations we can show the desired effect on processing time, cutting the calculation time by more than half. We can also show an increase in visual quality, based on objective quality metrics, compared to the original implementation as well as competing proposals.
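The incremental idea, reaching the target factor through repeated small steps instead of one large one, can be sketched as follows. Plain midpoint interpolation is used here as a hypothetical stand-in for the edge-weighted optimization that EWOC would perform at each step; the function names are illustrative only.

```python
# Sketch of incremental upscaling: the total factor is reached through
# repeated 2x steps. Here each step is simple midpoint interpolation;
# in EWOC each step would be one smaller edge-weighted optimization,
# so each individual problem is cheaper to solve.

def upscale_2x(depth):
    out = []
    for i in range(len(depth) - 1):
        out.append(depth[i])
        out.append((depth[i] + depth[i + 1]) / 2)  # insert midpoint sample
    out.append(depth[-1])
    return out

def incremental_upscale(depth, steps):
    for _ in range(steps):
        depth = upscale_2x(depth)  # each step roughly doubles the density
    return depth

# Two incremental 2x steps instead of one 4x step.
d = incremental_upscale([0.0, 4.0, 8.0], steps=2)
```

Splitting one large optimization into several smaller ones is what yields the reported reduction in processing time, since each smaller system is cheaper to solve.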

Keyword
3DTV, EWOC, DIBR, time-of-flight, depth map, upscaling, edge detection, incremental, optimization, view synthesis
National Category
Signal Processing
Identifiers
urn:nbn:se:miun:diva-17023 (URN)
10.1109/3DTV.2012.6365429 (DOI)
2-s2.0-84872059517 (Scopus ID)
STC (Local ID)
978-146734905-5 (ISBN)
STC (Archive number)
STC (OAI)
Conference
2012 3DTV-Conference: The True Vision - Capture, Transmission and Display of 3D Video, 3DTV-CON 2012; Zurich; 15 October 2012 through 17 October 2012; Category number CFP1255B-ART; Code 94817
Projects
3D video: Capture and Compression for Distribution
Available from: 2012-09-25 Created: 2012-09-19 Last updated: 2016-10-20. Bibliographically approved
7. Temporal Consistent Depth Map Upscaling for 3DTV
2014 (English). In: Proceedings of SPIE - The International Society for Optical Engineering: Three-Dimensional Image Processing, Measurement (3DIPM), and Applications 2014 / [ed] Atilla M. Baskurt; Robert Sitnik, 2014, Art. no. 901302. Conference paper (Refereed)
Abstract [en]

Precise scene depth information is a prerequisite in three-dimensional television (3DTV), e.g. for high quality view synthesis in autostereoscopic multiview displays. Unfortunately, this information is not easily obtained and often of limited quality. Dedicated range sensors, such as time-of-flight (ToF) cameras, can deliver reliable depth information where (stereo-)matching fails. Nonetheless, since these sensors provide only restricted spatial resolution, sophisticated upscaling methods are sought-after, to match depth information to corresponding texture frames. Where traditional upscaling fails, novel approaches have been proposed, utilizing additional information from the texture for the depth upscaling process. We recently proposed the Edge Weighted Optimization Concept (EWOC) for ToF upscaling, using texture edges for accurate depth boundaries. In this paper we propose an important update to EWOC, dividing it into smaller incremental upscaling steps. We predict two major improvements from this. Firstly, processing time should be decreased by dividing one big calculation into several smaller steps. Secondly, we assume an increase in quality for the upscaled depth map, due to a more coherent edge detection on the video frame. In our evaluations we can show the desired effect on processing time, cutting the calculation time by more than half. We can also show an increase in visual quality, based on objective quality metrics, compared to the original implementation as well as competing proposals.

Series
Conference volume, 9013
Keyword
3D video, 3DTV, depth acquisition, optical flow, depth map upscaling
National Category
Signal Processing
Identifiers
urn:nbn:se:miun:diva-21927 (URN)
10.1117/12.2032697 (DOI)
000336030900001 ()
2-s2.0-84901748474 (Scopus ID)
STC (Local ID)
STC (Archive number)
STC (OAI)
Conference
Three-Dimensional Image Processing, Measurement (3DIPM), and Applications 2014; San Francisco, CA; United States; 5 February 2014 through 5 February 2014; Code 105376
Available from: 2014-05-12 Created: 2014-05-12 Last updated: 2017-03-06. Bibliographically approved

Open Access in DiVA

fulltext (4968 kB)
File information
File name: FULLTEXT02.pdf
File size: 4968 kB
Checksum (SHA-512): 62a6ddeb3d7d37e13161b231110658fef459ad7b2349b37d4dda3617005079e26d001503a4ae42bcd1277d9dec24145fd81e03bae07da51810e0f7950b391e4c
Type: fulltext
Mimetype: application/pdf

Search in DiVA

By author/editor
Schwarz, Sebastian
By organisation
Department of Information and Communication systems
Computer Systems
