Adaptive depth filtering for HEVC 3D video coding
Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Technology and Media. (Realistic3D) ORCID iD: 0000-0002-2578-7896
Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Technology and Media. (Realistic3D)
Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Technology and Media. (Realistic3D) ORCID iD: 0000-0003-3751-6089
Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Technology and Media. (Realistic3D)
2012 (English). In: 2012 Picture Coding Symposium, PCS 2012, Proceedings, IEEE conference proceedings, 2012, p. 49-52. Conference paper, Published paper (Refereed)
Abstract [en]

Consumer interest in 3D television (3DTV) is growing steadily, but currently available 3D displays still require additional eyewear and are limited to a single stereo view pair. It can therefore be assumed that auto-stereoscopic multiview displays are the next step in 3D-at-home entertainment, since these displays can utilize the Multiview Video plus Depth (MVD) format to synthesize numerous viewing angles from only a small set of given input views. This makes efficient MVD compression a keystone for the commercial success of 3DTV. In this paper we concentrate on the compression of depth information in an MVD scenario. Several publications have suggested depth down- and upsampling to increase coding efficiency. We follow this path, using our recently introduced Edge Weighted Optimization Concept (EWOC) for depth upscaling. EWOC uses edge information from the video frame in the upscaling process and allows the use of sparse, non-uniformly distributed depth values. We exploit this fact to extend the depth down-/upsampling idea with an adaptive low-pass filter that reduces high-energy parts in the original depth map prior to subsampling and compression. Objective results show the viability of our approach for depth map compression with up-to-date High-Efficiency Video Coding (HEVC). For the same Y-PSNR in synthesized views we achieve up to 18.5% bit rate decrease compared to full-scale depth and around 10% compared to competing depth down-/upsampling solutions. These results were confirmed by a subjective quality assessment, showing a statistically significant preference for 87.5% of the test cases.
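
The pre-filtering idea in the abstract (smooth the high-energy parts of the depth map, then subsample before compression) can be illustrated with a rough sketch. This is not the filter used in the paper, which the abstract does not specify. The gradient-magnitude criterion, the box filter, and the names `box_blur`, `adaptive_lowpass`, `subsample`, `threshold` and `radius` are all assumptions made for illustration.

```python
import numpy as np

def box_blur(img, radius=1):
    """Mean filter with edge replication, using only NumPy."""
    k = 2 * radius + 1
    padded = np.pad(img.astype(np.float64), radius, mode="edge")
    out = np.zeros(img.shape, dtype=np.float64)
    # Sum the k*k shifted copies of the image, then normalise.
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (k * k)

def adaptive_lowpass(depth, threshold=10.0, radius=1):
    """Smooth only pixels whose gradient magnitude exceeds `threshold`.

    The gradient test is a hypothetical stand-in for "high energy parts".
    """
    d = depth.astype(np.float64)
    gy, gx = np.gradient(d)          # derivatives along rows, then columns
    energy = np.hypot(gx, gy)
    return np.where(energy > threshold, box_blur(d, radius), d)

def subsample(depth, factor=2):
    """Keep every `factor`-th sample in both directions."""
    return depth[::factor, ::factor]

# Toy depth map: a vertical step from 0 to 100.
depth = np.zeros((8, 8))
depth[:, 4:] = 100.0

filtered = adaptive_lowpass(depth)
small = subsample(filtered, 2)
max_step = np.abs(np.diff(filtered, axis=1)).max()
```

In this toy case the flat regions pass through unchanged while the single sharp step is spread over neighbouring pixels, so the subsampled map carries less abrupt transitions into the encoder. A real system would still need an upscaling stage, such as EWOC, to restore sharp depth edges at the decoder.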

Place, publisher, year, edition, pages
IEEE conference proceedings, 2012. 49-52 p.
Keyword [en]
3-D displays; 3-D television; 3D video coding; Auto stereoscopic; Bit rates; Coding efficiency; Consumer interests; Depth information; Depth Map; Depth value; Edge information; High energy; Multiview displays; Multiview video; Stereo view; Subjective quality assessments; Test case; Upsampling; Upscaling; Video frame; Viewing angle
National Category
Signal Processing
Identifiers
URN: urn:nbn:se:miun:diva-16211
DOI: 10.1109/PCS.2012.6213283
ISI: 000306962400013
Scopus ID: 2-s2.0-84864026988
Local ID: STC
ISBN: 978-1-4577-2048-2 (print)
OAI: oai:DiVA.org:miun-16211
DiVA: diva2:526703
Conference
29th Picture Coding Symposium, PCS 2012; Krakow; 7 May 2012 through 9 May 2012; Code 91163
Projects
Realistic3D
Available from: 2012-09-14 Created: 2012-05-14 Last updated: 2017-08-22 Bibliographically approved
In thesis
1. Depth Map Upscaling for Three-Dimensional Television: The Edge-Weighted Optimization Concept
2012 (English). Licentiate thesis, comprehensive summary (Other academic)
Abstract [en]

With the recent comeback of three-dimensional (3D) movies to the cinemas, there have been increasing efforts to spread the commercial success of 3D to new markets. The possibility of a 3D experience at home, such as three-dimensional television (3DTV), has generated a great deal of interest within the research and standardization community.

A central issue for 3DTV is the creation and representation of 3D content. Scene depth information plays a crucial role in all parts of the distribution chain, from content capture via transmission to the actual 3D display. This depth information is transmitted in the form of depth maps and is accompanied by corresponding video frames, i.e. for Depth Image Based Rendering (DIBR) view synthesis. Nonetheless, scenarios do exist in which the original spatial resolutions of depth maps and video frames do not match, e.g. sensor-driven depth capture or asymmetric 3D video coding. This resolution discrepancy is a problem, since DIBR requires the video frame and depth map to have matching resolutions. A considerable amount of research has been conducted into ways to match low-resolution depth maps to high-resolution video frames. Many proposed solutions utilize corresponding texture information in the upscaling process; however, they mostly fail to check this information for validity.

In striving for better 3DTV quality, this thesis presents the Edge-Weighted Optimization Concept (EWOC), a novel texture-guided depth upscaling application that addresses the lack of information validation. EWOC uses edge information from video frames as guidance in the depth upscaling process and, additionally, confirms this information based on the original low-resolution depth. Over the course of four publications, EWOC is applied in 3D content creation and distribution. Various guidance sources, such as different color spaces or texture pre-processing, are investigated. An alternative depth compression scheme, based on depth map upscaling, is proposed, and extensions for increased visual quality and computational performance are presented in this thesis. EWOC was evaluated and compared with competing approaches, with the main focus consistently on the visual quality of rendered 3D views. The results show an increase in both objective and subjective visual quality compared to state-of-the-art depth map upscaling methods. This quality gain motivates the choice of EWOC in applications affected by low-resolution depth.

In the end, EWOC can improve 3D content generation and distribution, enhancing the 3D experience to boost the commercial success of 3DTV.
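
To make the idea of edge-guided upscaling with validation concrete, here is a simplified one-dimensional sketch. It is not the EWOC implementation from the thesis. It assumes a weighted least-squares formulation in which neighbouring depth values are encouraged to be equal, except across texture edges that the sparse depth samples confirm. The names `upscale_1d`, `edge_thresh`, `depth_tol`, `eps` and `data_w` are hypothetical.

```python
import numpy as np

def upscale_1d(sparse_depth, known, texture,
               edge_thresh=20.0, depth_tol=5.0, eps=1e-3, data_w=1e3):
    """Fill in missing depth samples by weighted least squares (toy model).

    sparse_depth : depth values, only meaningful where `known` is True
    known        : boolean mask of available depth samples
    texture      : full-resolution guidance signal (e.g. luma)
    """
    n = len(texture)
    # Candidate edges between sample i and i+1, taken from the texture.
    tex_edge = np.abs(np.diff(texture.astype(np.float64))) > edge_thresh

    # Validation (assumed rule): keep a texture edge only if the two known
    # depth samples that bracket it actually differ.
    idx = np.flatnonzero(known)
    valid = np.zeros(n - 1, dtype=bool)
    for a, b in zip(idx[:-1], idx[1:]):
        if abs(sparse_depth[b] - sparse_depth[a]) > depth_tol:
            valid[a:b] = tex_edge[a:b]

    # Smoothness weight is small across validated edges, 1 elsewhere.
    w = np.where(valid, eps, 1.0)

    # Stack smoothness rows and (soft) data rows into one linear system.
    A = np.zeros((n - 1 + len(idx), n))
    rhs = np.zeros(n - 1 + len(idx))
    for i in range(n - 1):
        A[i, i] = -np.sqrt(w[i])
        A[i, i + 1] = np.sqrt(w[i])
    for r, k in enumerate(idx):
        A[n - 1 + r, k] = data_w
        rhs[n - 1 + r] = data_w * sparse_depth[k]

    d, *_ = np.linalg.lstsq(A, rhs, rcond=None)
    return d

texture = np.array([10, 10, 10, 10, 200, 200, 200, 200], dtype=float)
known = np.array([True, False, False, False, False, False, False, True])

# Case 1: depth differs across the texture edge, so the edge is confirmed.
sparse_a = np.array([50, 0, 0, 0, 0, 0, 0, 150], dtype=float)
dense_a = upscale_1d(sparse_a, known, texture)

# Case 2: depth is the same on both sides, so the texture edge is ignored.
sparse_b = np.array([50, 0, 0, 0, 0, 0, 0, 50], dtype=float)
dense_b = upscale_1d(sparse_b, known, texture)
```

In the first case the reconstructed depth stays close to 50 up to the texture edge and close to 150 after it, rather than ramping linearly between the two samples. In the second case the texture edge has no support in the depth data, so the result stays flat. This is the behaviour the validation step is meant to provide.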

Place, publisher, year, edition, pages
Sundsvall, Sweden: Mittuniversitetet, 2012. 57 p.
Series
Mid Sweden University licentiate thesis, ISSN 1652-8948 ; 92
Keyword
3d video, 3DTV, video coding, capture, distribution, EWOC, depth map upscaling, time-of-flight
National Category
Signal Processing
Identifiers
urn:nbn:se:miun:diva-17048 (URN)
978-91-87103-41-4 (ISBN)
Presentation
2012-11-22, O111, Mittuniversitetet - Holmgatan 10, Sundsvall, 09:00 (English)
Available from: 2012-10-22 Created: 2012-09-24 Last updated: 2017-08-22 Bibliographically approved
2. Gaining Depth: Time-of-Flight Sensor Fusion for Three-Dimensional Video Content Creation
2014 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

The successful revival of three-dimensional (3D) cinema has generated a great deal of interest in 3D video. However, contemporary eyewear-assisted display technologies are not well suited to the less restricted scenarios outside movie theaters. The next generation of 3D displays, autostereoscopic multiview displays, overcomes the restrictions of traditional stereoscopic 3D and can provide an important boost for 3D television (3DTV). At the same time, such displays require scene depth information in order to reduce the amount of necessary input data. Acquiring this information is quite complex and challenging, thus restricting content creators and limiting the amount of available 3D video content. Nonetheless, without broad and innovative 3D television programs, even next-generation 3DTV will lack customer appeal. Therefore, simplified 3D video content generation is essential for the medium's success.

This dissertation surveys the advantages and limitations of contemporary 3D video acquisition. Based on these findings, a combination of dedicated depth sensors, so-called Time-of-Flight (ToF) cameras, and video cameras is investigated with the aim of simplifying 3D video content generation. The concept of Time-of-Flight sensor fusion is analyzed in order to identify suitable courses of action for high quality 3D video acquisition. In order to overcome the main drawbacks of current Time-of-Flight technology, namely high sensor noise and low spatial resolution, a weighted optimization approach for Time-of-Flight super-resolution is proposed. This approach incorporates video texture, measurement noise and temporal information for high quality 3D video acquisition from a single video plus Time-of-Flight camera combination. Objective evaluations show benefits with respect to state-of-the-art depth upsampling solutions. Subjective visual quality assessment confirms the objective results, with a significant increase in viewer preference by a factor of four. Furthermore, the presented super-resolution approach can be applied to other applications, such as depth video compression, providing bit rate savings of approximately 10 percent compared to competing depth upsampling solutions. The work presented in this dissertation has been published in two scientific journals and five peer-reviewed conference proceedings.

In conclusion, Time-of-Flight sensor fusion can help to simplify 3D video content generation, consequently supporting a larger variety of available content. Thus, this dissertation provides important inputs towards broad and innovative 3D video content, hopefully contributing to the future success of next-generation 3DTV.

Place, publisher, year, edition, pages
Sundsvall: Mittuniversitetet, 2014. 228 p.
Series
Mid Sweden University doctoral thesis, ISSN 1652-893X ; 185
Keyword
3D video, Time-of-Flight, depth map acquisition, optimization, 3DTV, ToF, upsampling, super-resolution, sensor fusion
National Category
Computer Systems
Identifiers
urn:nbn:se:miun:diva-21938 (URN)
STC (Local ID)
978-91-87557-49-1 (ISBN)
STC (Archive number)
STC (OAI)
Public defence
2014-06-04, L111, Holmgatan 10, Sundsvall, 10:00 (English)
Funder
Knowledge Foundation, 2009/0264
Available from: 2014-05-16 Created: 2014-05-14 Last updated: 2017-08-22 Bibliographically approved

Open Access in DiVA

fulltext (600 kB), 371 downloads
File information
File name: FULLTEXT02.pdf
File size: 600 kB
Checksum (SHA-512): 61705a3a7861c1af6dad4359a22a1df8fd914911957a986cddd903490c2c15c091e250044db39cf7eec23074817f6edd52e1834051f2e2eb7341d2b183dfbfc8
Type: fulltext
Mimetype: application/pdf

Authors
Schwarz, Sebastian; Olsson, Roger; Sjöström, Mårten; Tourancheau, Sylvain