Publications (10 of 13)
Schwarz, S., Sjöström, M. & Olsson, R. (2016). Depth or disparity map upscaling. US9525858B2.
Depth or disparity map upscaling
2016 (English) Patent (Other (popular science, discussion, etc.))
Abstract [en]

Method and arrangement for increasing the resolution of a depth or disparity map related to multi view video. The method comprises deriving a high resolution depth map based on a low resolution depth map and a masked texture image edge map. The masked texture image edge map comprises information on edges in a high resolution texture image, which edges have a correspondence in the low resolution depth map. The texture image and the depth map are associated with the same frame.
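The masking step described above can be pictured as keeping only those texture edges that are corroborated by an edge in the low-resolution depth map. Below is a minimal Python/OpenCV sketch of that idea, not the patented method itself; the function name, Canny thresholds and dilation tolerance are illustrative assumptions, and inputs are assumed to be 8-bit grayscale images.

```python
import cv2
import numpy as np

def masked_texture_edge_map(texture_hr, depth_lr, dilate_px=3):
    """Keep only high-resolution texture edges that have a
    correspondence in the low-resolution depth map."""
    h, w = texture_hr.shape[:2]
    # Edges in the high-resolution texture image.
    tex_edges = cv2.Canny(texture_hr, 50, 150)
    # Edges in the depth map, brought up to texture resolution.
    depth_edges = cv2.Canny(depth_lr, 20, 60)
    depth_edges = cv2.resize(depth_edges, (w, h),
                             interpolation=cv2.INTER_NEAREST)
    # Dilate to tolerate small misalignment between the two sensors.
    kernel = np.ones((dilate_px, dilate_px), np.uint8)
    depth_edges = cv2.dilate(depth_edges, kernel)
    # Texture edges without a depth correspondence are discarded.
    return cv2.bitwise_and(tex_edges, depth_edges)
```

The resulting mask can then guide the upscaling of the low-resolution depth map, so that depth discontinuities land on validated texture edges.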

National Category
Media and Communication Technology
Identifiers
urn:nbn:se:miun:diva-33387 (URN)
Patent
US9525858B2 (2016-12-20)
Available from: 2018-04-02 Created: 2018-04-02 Last updated: 2018-11-06 Bibliographically approved
Schwarz, S., Sjöström, M. & Olsson, R. (2014). A Weighted Optimization Approach to Time-of-Flight Sensor Fusion. IEEE Transactions on Image Processing, 23(1), 214-225
A Weighted Optimization Approach to Time-of-Flight Sensor Fusion
2014 (English) In: IEEE Transactions on Image Processing, ISSN 1057-7149, E-ISSN 1941-0042, Vol. 23, no 1, p. 214-225. Article in journal (Refereed) Published
Abstract [en]

Acquiring scenery depth is a fundamental task in computer vision, with many applications in manufacturing, surveillance, or robotics relying on accurate scenery information. Time-of-flight cameras can provide depth information in real-time and overcome shortcomings of traditional stereo analysis. However, they provide limited spatial resolution, so sophisticated upscaling algorithms are sought after. In this paper, we present a sensor fusion approach to time-of-flight super resolution, based on the combination of depth and texture sources. Unlike other texture guided approaches, we interpret the depth upscaling process as a weighted energy optimization problem. Three different weights are introduced, employing different available sensor data. The individual weights address object boundaries in depth, depth sensor noise, and temporal consistency. Applied in consecutive order, they form three weighting strategies for time-of-flight super resolution. Objective evaluations show advantages in depth accuracy and for depth image based rendering compared with state-of-the-art depth upscaling. Subjective view synthesis evaluation shows a significant increase in viewer preference, by a factor of four, in stereoscopic viewing conditions. To the best of our knowledge, this is the first extensive subjective test performed on time-of-flight depth upscaling. Objective and subjective results prove the suitability of our time-of-flight super-resolution approach for depth scenery capture.
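Read as an optimization problem, the three weighting strategies can be pictured in a generic form such as the one below. This is a sketch of the weighted-energy idea in our own notation (D is the sought high-resolution depth, d the low-resolution measurements over sample set S, N a pixel neighbourhood, lambda a regularization factor), not the exact functional from the paper.

```latex
E(D) = \sum_{p \in S} \left(D_p - d_p\right)^2
     + \lambda \sum_{(p,q) \in \mathcal{N}}
       w^{\mathrm{edge}}_{pq}\, w^{\mathrm{noise}}_{q}\, w^{\mathrm{temp}}_{q}
       \left(D_p - D_q\right)^2
```

Minimizing E(D) pulls the upscaled depth towards the measurements, while the three weights relax smoothing across texture edges, down-weight noisy samples, and stabilize the result over time.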

Place, publisher, year, edition, pages
IEEE Signal Processing Society, 2014
Keywords
Sensor fusion, range data, time-of-flight sensors, depth map upscaling, 3D video, stereo vision
National Category
Signal Processing
Identifiers
urn:nbn:se:miun:diva-20415 (URN); 10.1109/TIP.2013.2287613 (DOI); 000329195500017 (ISI); 2-s2.0-84888373138 (Scopus ID); STC (Local ID)
Projects
Realistic3D
Funder
Knowledge Foundation, 2009/0264
Note

This work was supported in part by the KK Foundation of Sweden under Grant 2009/0264, in part by the EU European Regional Development Fund, Mellersta Norrland, Sweden, under Grant 00156702, and in part by Länsstyrelsen Västernorrland, Sweden, under Grant 00155148.

Available from: 2013-12-03 Created: 2013-12-03 Last updated: 2017-12-06 Bibliographically approved
Schwarz, S. (2014). Gaining Depth: Time-of-Flight Sensor Fusion for Three-Dimensional Video Content Creation. (Doctoral dissertation). Sundsvall: Mittuniversitetet
Gaining Depth: Time-of-Flight Sensor Fusion for Three-Dimensional Video Content Creation
2014 (English) Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

The successful revival of three-dimensional (3D) cinema has generated a great deal of interest in 3D video. However, contemporary eyewear-assisted displaying technologies are not well suited for the less restricted scenarios outside movie theaters. The next generation of 3D displays, autostereoscopic multiview displays, overcome the restrictions of traditional stereoscopic 3D and can provide an important boost for 3D television (3DTV). On the other hand, such displays require scene depth information in order to reduce the amount of necessary input data. Acquiring this information is quite complex and challenging, thus restricting content creators and limiting the amount of available 3D video content. Nonetheless, without broad and innovative 3D television programs, even next-generation 3DTV will lack customer appeal. Therefore, simplified 3D video content generation is essential for the medium's success.

This dissertation surveys the advantages and limitations of contemporary 3D video acquisition. Based on these findings, a combination of dedicated depth sensors, so-called Time-of-Flight (ToF) cameras, and video cameras is investigated with the aim of simplifying 3D video content generation. The concept of Time-of-Flight sensor fusion is analyzed in order to identify suitable courses of action for high quality 3D video acquisition. In order to overcome the main drawbacks of current Time-of-Flight technology, namely high sensor noise and low spatial resolution, a weighted optimization approach for Time-of-Flight super-resolution is proposed. This approach incorporates video texture, measurement noise and temporal information for high quality 3D video acquisition from a single video plus Time-of-Flight camera combination. Objective evaluations show benefits with respect to state-of-the-art depth upsampling solutions. Subjective visual quality assessment confirms the objective results, with a significant increase in viewer preference by a factor of four. Furthermore, the presented super-resolution approach can be applied to other applications, such as depth video compression, providing bit rate savings of approximately 10 percent compared to competing depth upsampling solutions. The work presented in this dissertation has been published in two scientific journals and five peer-reviewed conference proceedings.

In conclusion, Time-of-Flight sensor fusion can help to simplify 3D video content generation, consequently supporting a larger variety of available content. Thus, this dissertation provides important inputs towards broad and innovative 3D video content, hopefully contributing to the future success of next-generation 3DTV.

Place, publisher, year, edition, pages
Sundsvall: Mittuniversitetet, 2014. p. 228
Series
Mid Sweden University doctoral thesis, ISSN 1652-893X ; 185
Keywords
3D video, Time-of-Flight, depth map acquisition, optimization, 3DTV, ToF, upsampling, super-resolution, sensor fusion
National Category
Computer Systems
Identifiers
urn:nbn:se:miun:diva-21938 (URN); STC (Local ID); 978-91-87557-49-1 (ISBN)
Public defence
2014-06-04, L111, Holmgatan 10, Sundsvall, 10:00 (English)
Funder
Knowledge Foundation, 2009/0264
Available from: 2014-05-16 Created: 2014-05-14 Last updated: 2017-08-22 Bibliographically approved
Schwarz, S., Sjöström, M. & Olsson, R. (2014). Multivariate Sensitivity Analysis of Time-of-Flight Sensor Fusion. 3D Research, 5(3)
Multivariate Sensitivity Analysis of Time-of-Flight Sensor Fusion
2014 (English) In: 3D Research, ISSN 2092-6731, Vol. 5, no 3. Article in journal (Refereed) Published
Abstract [en]

Obtaining three-dimensional scenery data is an essential task in computer vision, with diverse applications in areas such as manufacturing and quality control, security and surveillance, or user interaction and entertainment. Dedicated Time-of-Flight sensors can provide detailed scenery depth in real-time and overcome shortcomings of traditional stereo analysis. Nonetheless, they do not provide texture information and have limited spatial resolution. Therefore, such sensors are typically combined with high resolution video sensors. Time-of-Flight sensor fusion is a highly active field of research. In recent years, there have been multiple proposals addressing important topics such as texture-guided depth upsampling and depth data denoising. In this article we take a step back and look at the underlying principles of ToF sensor fusion. We derive the ToF sensor fusion error model and evaluate its sensitivity to inaccuracies in camera calibration and depth measurements. In accordance with our findings, we propose certain courses of action to ensure high quality fusion results. With this multivariate sensitivity analysis of the ToF sensor fusion model, we provide an important guideline for designing, calibrating and running sophisticated Time-of-Flight sensor fusion capture systems.
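The fusion model underlying this analysis maps each ToF depth reading into the video camera's image plane, so calibration inaccuracies propagate directly into the fused result. A minimal Python sketch of that mapping, assuming standard pinhole intrinsics (K_tof, K_rgb) and extrinsics (R, t) between the two cameras:

```python
import numpy as np

def fuse_tof_pixel(u, v, z, K_tof, K_rgb, R, t):
    """Map one ToF reading (pixel (u, v), metric depth z) into the
    RGB image: back-project, rigid transform, re-project."""
    # Back-project to a 3D point in ToF camera coordinates.
    p_tof = z * np.linalg.inv(K_tof) @ np.array([u, v, 1.0])
    # Rigid transform into RGB camera coordinates.
    p_rgb = R @ p_tof + t
    # Pinhole projection onto the RGB image plane.
    uvw = K_rgb @ p_rgb
    return uvw[:2] / uvw[2]
```

Perturbing R, t, either K, or the depth z in this chain shifts where a sample lands in the RGB image; quantifying those shifts is exactly what the multivariate sensitivity analysis does.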

Place, publisher, year, edition, pages
Springer Publishing Company, 2014
Keywords
sensor fusion; model sensitivity; range data; time-of-flight sensors; depth map upsampling; three-dimensional video
National Category
Signal Processing; Media and Communication Technology
Identifiers
urn:nbn:se:miun:diva-22572 (URN); 10.1007/s13319-014-0018-3 (DOI); 2-s2.0-84919934468 (Scopus ID); STC (Local ID)
Available from: 2014-08-14 Created: 2014-08-14 Last updated: 2018-01-11 Bibliographically approved
Schwarz, S., Sjöström, M. & Olsson, R. (2014). Temporal Consistent Depth Map Upscaling for 3DTV. In: Atilla M. Baskurt; Robert Sitnik (Ed.), Proceedings of SPIE - The International Society for Optical Engineering: Three-Dimensional Image Processing, Measurement (3DIPM), and Applications 2014. Paper presented at Three-Dimensional Image Processing, Measurement (3DIPM), and Applications 2014; San Francisco, CA; United States; 5 February 2014 through 5 February 2014; Code 105376 (pp. Art. no. 901302).
Temporal Consistent Depth Map Upscaling for 3DTV
2014 (English) In: Proceedings of SPIE - The International Society for Optical Engineering: Three-Dimensional Image Processing, Measurement (3DIPM), and Applications 2014 / [ed] Atilla M. Baskurt; Robert Sitnik, 2014, p. Art. no. 901302. Conference paper, Published paper (Refereed)
Abstract [en]

Precise scene depth information is a pre-requisite in three-dimensional television (3DTV), e.g. for high quality view synthesis in autostereoscopic multiview displays. Unfortunately, this information is not easily obtained and often of limited quality. Dedicated range sensors, such as time-of-flight (ToF) cameras, can deliver reliable depth information where (stereo-)matching fails. Nonetheless, since these sensors provide only restricted spatial resolution, sophisticated upscaling methods are sought after to match depth information to corresponding texture frames. Where traditional upscaling fails, novel approaches have been proposed, utilizing additional information from the texture for the depth upscaling process. We recently proposed the Edge Weighted Optimization Concept (EWOC) for ToF upscaling, using texture edges for accurate depth boundaries. In this paper we propose an important update to EWOC, dividing it into smaller incremental upscaling steps. We predict two major improvements from this. Firstly, processing time should be decreased by dividing one big calculation into several smaller steps. Secondly, we expect an increase in quality for the upscaled depth map, due to a more coherent edge detection on the video frame. In our evaluations we show the desired effect on processing time, cutting the calculation time by more than half. We also show an increase in visual quality, based on objective quality metrics, compared to the original implementation as well as competing proposals.
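The incremental variant can be sketched as a loop of 2x stages, each followed by an edge-guided refinement pass. In the Python sketch below, refine_step is a hypothetical callback standing in for one texture-guided upscaling iteration (e.g. an EWOC pass), not an API from the paper.

```python
import cv2

def incremental_upscale(depth, texture, refine_step):
    """Upscale `depth` towards the resolution of `texture` in 2x
    stages instead of one large step."""
    th, tw = texture.shape[:2]
    while depth.shape[0] < th or depth.shape[1] < tw:
        nh = min(2 * depth.shape[0], th)
        nw = min(2 * depth.shape[1], tw)
        depth = cv2.resize(depth, (nw, nh), interpolation=cv2.INTER_LINEAR)
        # Match the texture to the current stage for coherent edges.
        tex_stage = cv2.resize(texture, (nw, nh), interpolation=cv2.INTER_AREA)
        depth = refine_step(depth, tex_stage)
    return depth
```

Each stage bridges only a small resolution gap, which is where the reported gains in runtime and edge coherence come from.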

Series
Conference volume ; 9013
Keywords
3d video, 3DTV, depth acquisition, optical flow, depth map upscaling
National Category
Signal Processing
Identifiers
urn:nbn:se:miun:diva-21927 (URN); 10.1117/12.2032697 (DOI); 000336030900001 (ISI); 2-s2.0-84901748474 (Scopus ID); STC (Local ID)
Conference
Three-Dimensional Image Processing, Measurement (3DIPM), and Applications 2014; San Francisco, CA; United States; 5 February 2014 through 5 February 2014; Code 105376
Available from: 2014-05-12 Created: 2014-05-12 Last updated: 2017-08-22 Bibliographically approved
Schwarz, S., Sjöström, M. & Olsson, R. (2014). Time-of-Flight Sensor Fusion with Depth Measurement Reliability Weighting. In: 3DTV-Conference. Paper presented at 3DTV-Conference: The True Vision - Capture, Transmission and Display of 3D Video, 3DTV-CON 2014; Budapest; Hungary; 2 July 2014 through 4 July 2014; Category number CFP1455B-ART; Code 107089 (pp. Art. no. 6874759). IEEE Computer Society
Time-of-Flight Sensor Fusion with Depth Measurement Reliability Weighting
2014 (English) In: 3DTV-Conference, IEEE Computer Society, 2014, p. Art. no. 6874759. Conference paper, Published paper (Refereed)
Abstract [en]

Accurate scene depth capture is essential for the success of three-dimensional television (3DTV), e.g. for high quality view synthesis in autostereoscopic multiview displays. Unfortunately, scene depth is not easily obtained and often of limited quality. Dedicated Time-of-Flight (ToF) sensors can deliver reliable depth readings where traditional methods, such as stereo-vision analysis, fail. However, since ToF sensors provide only limited spatial resolution and suffer from sensor noise, sophisticated upsampling methods are sought after. A multitude of ToF upsampling solutions have been proposed in recent years. Most of them achieve ToF super-resolution (TSR) by sensor fusion between ToF and additional sources, e.g. video. We recently proposed a weighted error energy minimization approach for ToF super-resolution, incorporating texture, sensor noise and temporal information. In this article, we take a closer look at the sensor noise weighting related to the Time-of-Flight active brightness signal. We determine a depth measurement reliability function by optimizing free parameters on test data and verifying it with independent test cases. In the presented double-weighted TSR proposal, depth readings are weighted into the upsampling process with regard to their reliability, removing erroneous influences from the final result. Our evaluations confirm the desired effect of depth measurement reliability weighting, decreasing the depth upsampling error by almost 40% in comparison to competing proposals.
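The reliability weighting can be illustrated with a simple amplitude-to-weight mapping. The paper determines its reliability function by fitting free parameters to test data, so the linear ramp and the thresholds below are purely illustrative assumptions.

```python
import numpy as np

def reliability_weights(amplitude, a_min=0.02, a_sat=0.5):
    """Map the ToF active-brightness (amplitude) image to per-pixel
    reliability weights in [0, 1]: readings below a_min are treated
    as unusable, readings above a_sat as fully reliable."""
    w = (amplitude - a_min) / (a_sat - a_min)
    return np.clip(w, 0.0, 1.0)
```

Multiplying the data term of the upsampling energy by such weights lets low-amplitude, unreliable depth readings contribute less to the final depth map.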

Place, publisher, year, edition, pages
IEEE Computer Society, 2014
Keywords
time-of-flight, active brightness, sensor-fusion, super-resolution, 3D video, scene depth, depth map upsampling
National Category
Media and Communication Technology; Signal Processing
Identifiers
urn:nbn:se:miun:diva-22573 (URN); 10.1109/3DTV.2014.6874759 (DOI); 000345738600049 (ISI); 2-s2.0-84906568735 (Scopus ID); STC (Local ID); 978-1-4799-4758-4 (ISBN)
Conference
3DTV-Conference: The True Vision - Capture, Transmission and Display of 3D Video, 3DTV-CON 2014; Budapest; Hungary; 2 July 2014 through 4 July 2014; Category number CFP1455B-ART; Code 107089
Available from: 2014-08-14 Created: 2014-08-14 Last updated: 2018-01-11 Bibliographically approved
Schwarz, S., Olsson, R. & Sjöström, M. (2013). Depth Sensing for 3DTV: A Survey. IEEE Multimedia, 20(4), 10-17
Depth Sensing for 3DTV: A Survey
2013 (English) In: IEEE Multimedia, ISSN 1070-986X, E-ISSN 1941-0166, Vol. 20, no 4, p. 10-17. Article in journal (Refereed) Published
Abstract [en]

In the context of 3D video systems, depth information could be used to render a scene from additional viewpoints. Although there have been many recent advances in this area, including the introduction of the Microsoft Kinect sensor, the robust acquisition of such information continues to be a challenge. This article reviews three depth-sensing approaches for 3DTV. The authors discuss several approaches for acquiring depth information and provide a comparative analysis of their characteristics.

Place, publisher, year, edition, pages
IEEE Computer Society, 2013
Keywords
3D video, scene acquisition, capture, depth sensing, stereo analysis, structured lighting, time-of-flight, sensor fusion
National Category
Signal Processing
Identifiers
urn:nbn:se:miun:diva-20416 (URN); 10.1109/MMUL.2013.53 (DOI); 000327723900007 (ISI); 2-s2.0-84890069117 (Scopus ID); STC (Local ID)
Projects
Realistic3D
Funder
Knowledge Foundation, 2009/0264
Note

This work has been supported by grant 2009/0264 of the KK Foundation, Sweden; grant 00156702 of the EU European Regional Development Fund, Mellersta Norrland, Sweden; and grant 00155148 of Länsstyrelsen Västernorrland, Sweden.

Available from: 2013-12-03 Created: 2013-12-03 Last updated: 2017-12-06 Bibliographically approved
Schwarz, S., Olsson, R., Sjöström, M. & Tourancheau, S. (2012). Adaptive depth filtering for HEVC 3D video coding. In: 2012 Picture Coding Symposium, PCS 2012, Proceedings. Paper presented at 29th Picture Coding Symposium, PCS 2012; Krakow; 7 May 2012 through 9 May 2012; Code 91163 (pp. 49-52). IEEE conference proceedings
Adaptive depth filtering for HEVC 3D video coding
2012 (English) In: 2012 Picture Coding Symposium, PCS 2012, Proceedings, IEEE conference proceedings, 2012, p. 49-52. Conference paper, Published paper (Refereed)
Abstract [en]

Consumer interest in 3D television (3DTV) is growing steadily, but currently available 3D displays still require additional eyewear and suffer from the limitation of a single stereo view pair. It can therefore be assumed that auto-stereoscopic multiview displays are the next step in 3D-at-home entertainment, since these displays can utilize the Multiview Video plus Depth (MVD) format to synthesize numerous viewing angles from only a small set of given input views. This motivates efficient MVD compression as an important keystone for the commercial success of 3DTV. In this paper we concentrate on the compression of depth information in an MVD scenario. There have been several publications suggesting depth down- and upsampling to increase coding efficiency. We follow this path, using our recently introduced Edge Weighted Optimization Concept (EWOC) for depth upscaling. EWOC uses edge information from the video frame in the upscaling process and allows the use of sparse, non-uniformly distributed depth values. We exploit this fact to extend the depth down-/upsampling idea with an adaptive low-pass filter, reducing high energy parts in the original depth map prior to subsampling and compression. Objective results show the viability of our approach for depth map compression with up-to-date High Efficiency Video Coding (HEVC). For the same Y-PSNR in synthesized views we achieve up to 18.5% bit rate decrease compared to full-scale depth and around 10% compared to competing depth down-/upsampling solutions. These results were confirmed by a subjective quality assessment, showing a statistically significant preference in 87.5% of the test cases.
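The down-/upsampling chain with its preceding low-pass stage might look like the following Python sketch, where smoothing strength adapts to local depth activity; the energy measure and blending heuristic are our assumptions, not the filter from the paper.

```python
import cv2
import numpy as np

def adaptive_lowpass_subsample(depth, factor=4, sigma_max=3.0):
    """Smooth high-energy regions of a depth map before subsampling;
    the texture-guided upscaler (e.g. EWOC) is expected to restore
    sharp boundaries on the decoder side."""
    d = depth.astype(np.float32)
    # Local high-frequency energy, normalised to [0, 1].
    energy = cv2.GaussianBlur(np.abs(cv2.Laplacian(d, cv2.CV_32F)), (0, 0), 2.0)
    energy /= energy.max() + 1e-9
    # Blend: heavy smoothing where local energy is high.
    smoothed = cv2.GaussianBlur(d, (0, 0), sigma_max)
    filtered = (1.0 - energy) * d + energy * smoothed
    return filtered[::factor, ::factor]
```

Reducing the high-energy parts before subsampling lowers the bit rate of the encoded depth, while the texture-guided upscaler recovers the discarded edges from the video frame.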

Place, publisher, year, edition, pages
IEEE conference proceedings, 2012
Keywords
3-D displays; 3-D television; 3D video coding; Auto stereoscopic; Bit rates; Coding efficiency; Consumer interests; Depth information; Depth Map; Depth value; Edge information; High energy; Multiview displays; Multiview video; Stereo view; Subjective quality assessments; Test case; Upsampling; Upscaling; Video frame; Viewing angle
National Category
Signal Processing
Identifiers
urn:nbn:se:miun:diva-16211 (URN); 10.1109/PCS.2012.6213283 (DOI); 000306962400013 (ISI); 2-s2.0-84864026988 (Scopus ID); STC (Local ID); 978-1-4577-2048-2 (ISBN)
Conference
29th Picture Coding Symposium, PCS 2012; Krakow; 7 May 2012 through 9 May 2012; Code 91163
Projects
Realistic3D
Available from: 2012-09-14 Created: 2012-05-14 Last updated: 2017-08-22 Bibliographically approved
Olsson, R., Adhikarla, V. K., Schwarz, S. & Sjöström, M. (2012). Converting conventional stereo pairs to multi-view sequences using morphing. In: Proceedings of SPIE - The International Society for Optical Engineering. Paper presented at Stereoscopic Displays and Applications XXIII; Burlingame, CA; 23 January 2012 through 25 January 2012; Code 89426 (pp. Art. no. 828828). SPIE - International Society for Optical Engineering
Converting conventional stereo pairs to multi-view sequences using morphing
2012 (English) In: Proceedings of SPIE - The International Society for Optical Engineering, SPIE - International Society for Optical Engineering, 2012, p. Art. no. 828828. Conference paper, Published paper (Refereed)
Abstract [en]

Autostereoscopic multi-view displays require multiple views of a scene to provide motion parallax. When an observer changes viewing angle, different stereoscopic pairs are perceived, allowing new perspectives of the scene to be seen and giving a more realistic 3D experience. However, capturing an arbitrary number of views is at best cumbersome, and on some occasions impossible. Conventional stereo video (CSV) operates on two video signals captured using two cameras at two different perspectives. Generation and transmission of two views is more feasible than that of multiple views. It would be more efficient if the multiple views required by an autostereoscopic display could be synthesized from this sparse set of views. This paper addresses the conversion of stereoscopic video to multiview video using image morphing. Different morphing algorithms are implemented and evaluated. Contrary to traditional conversion methods, these algorithms disregard explicit physical depth and instead generate intermediate views using sparse sets of correspondence features and image morphing. A novel morphing algorithm is also presented that uses the scale invariant feature transform (SIFT) and segmentation to construct robust correspondence features and high-quality intermediate views. All algorithms are evaluated on a subjective and objective basis and the comparison results are presented.
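The correspondence stage of such a morphing pipeline can be sketched with OpenCV's SIFT implementation (cv2.SIFT_create is available in OpenCV 4.4 and later). The actual morph, warping both views towards the interpolated points and blending, is omitted here, and the ratio-test threshold is an illustrative choice.

```python
import cv2
import numpy as np

def intermediate_correspondences(img_l, img_r, alpha=0.5, ratio=0.75):
    """Find sparse SIFT correspondences between a stereo pair and
    interpolate their positions to a virtual viewpoint alpha in [0, 1]."""
    sift = cv2.SIFT_create()
    kp_l, des_l = sift.detectAndCompute(img_l, None)
    kp_r, des_r = sift.detectAndCompute(img_r, None)
    matches = cv2.BFMatcher().knnMatch(des_l, des_r, k=2)
    points = []
    for m, n in matches:
        if m.distance < ratio * n.distance:  # Lowe's ratio test
            pl = np.array(kp_l[m.queryIdx].pt)
            pr = np.array(kp_r[m.trainIdx].pt)
            points.append((1.0 - alpha) * pl + alpha * pr)
    return np.array(points)
```

These interpolated feature positions define where the morph pulls both source images before blending them into the intermediate view.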

Place, publisher, year, edition, pages
SPIE - International Society for Optical Engineering, 2012
Keywords
3D, stereo to multiview conversion, view synthesis, warping, field morphing
National Category
Media Engineering; Electrical Engineering, Electronic Engineering, Information Engineering
Identifiers
urn:nbn:se:miun:diva-15806 (URN); 10.1117/12.909253 (DOI); 000302558300075 (ISI); 2-s2.0-84860147557 (Scopus ID); STC (Local ID); 978-081948935-7 (ISBN)
Conference
Stereoscopic Displays and Applications XXIII; Burlingame, CA; 23 January 2012 through 25 January 2012; Code 89426
Available from: 2012-01-31 Created: 2012-01-31 Last updated: 2017-08-22 Bibliographically approved
Schwarz, S. (2012). Depth Map Upscaling for Three-Dimensional Television: The Edge-Weighted Optimization Concept. (Licentiate dissertation). Sundsvall, Sweden: Mittuniversitetet
Depth Map Upscaling for Three-Dimensional Television: The Edge-Weighted Optimization Concept
2012 (English) Licentiate thesis, comprehensive summary (Other academic)
Abstract [en]

With the recent comeback of three-dimensional (3D) movies to the cinemas, there have been increasing efforts to spread the commercial success of 3D to new markets. The possibility of a 3D experience at home, such as three-dimensional television (3DTV), has generated a great deal of interest within the research and standardization community.

A central issue for 3DTV is the creation and representation of 3D content. Scene depth information plays a crucial role in all parts of the distribution chain, from content capture via transmission to the actual 3D display. This depth information is transmitted in the form of depth maps, accompanied by corresponding video frames, i.e. for Depth Image Based Rendering (DIBR) view synthesis. Nonetheless, scenarios do exist for which the original spatial resolutions of depth maps and video frames do not match, e.g. sensor-driven depth capture or asymmetric 3D video coding. This resolution discrepancy is a problem, since DIBR requires accordance between the video frame and depth map. A considerable amount of research has been conducted into ways to match low-resolution depth maps to high resolution video frames. Many proposed solutions utilize corresponding texture information in the upscaling process; however, they mostly fail to review this information for validity.

In the pursuit of better 3DTV quality, this thesis presents the Edge-Weighted Optimization Concept (EWOC), a novel texture-guided depth upscaling approach that addresses the lack of information validation. EWOC uses edge information from video frames as guidance in the depth upscaling process and, additionally, confirms this information against the original low resolution depth. Over the course of four publications, EWOC is applied in 3D content creation and distribution. Various guidance sources, such as different color spaces or texture pre-processing, are investigated. An alternative depth compression scheme, based on depth map upscaling, is proposed, and extensions for increased visual quality and computational performance are presented in this thesis. EWOC was evaluated and compared with competing approaches, with the main focus consistently on the visual quality of rendered 3D views. The results show an increase in both objective and subjective visual quality compared to state-of-the-art depth map upscaling methods. This quality gain motivates the choice of EWOC in applications affected by low resolution depth.

In the end, EWOC can improve 3D content generation and distribution, enhancing the 3D experience to boost the commercial success of 3DTV.

Place, publisher, year, edition, pages
Sundsvall, Sweden: Mittuniversitetet, 2012. p. 57
Series
Mid Sweden University licentiate thesis, ISSN 1652-8948 ; 92
Keywords
3d video, 3DTV, video coding, capture, distribution, EWOC, depth map upscaling, time-of-flight
National Category
Signal Processing
Identifiers
urn:nbn:se:miun:diva-17048 (URN); 978-91-87103-41-4 (ISBN)
Presentation
2012-11-22, O111, Mittuniversitetet - Holmgatan 10, Sundsvall, 09:00 (English)
Available from: 2012-10-22 Created: 2012-09-24 Last updated: 2017-08-22 Bibliographically approved
Identifiers
ORCID iD: orcid.org/0000-0002-2578-7896