miun.se Publications
Li, Yun
Publications (10 of 12)
Conti, C., Soares, L. D., Nunes, P., Perra, C., Assunção, P. A., Sjöström, M., . . . Jennehag, U. (2018). Light Field Image Compression. In: Assunção, Pedro Amado, Gotchev, Atanas (Eds.), 3D Visual Content Creation, Coding and Delivery: (pp. 143-176). Cham: Springer
Light Field Image Compression
2018 (English) In: 3D Visual Content Creation, Coding and Delivery / [ed] Assunção, Pedro Amado, Gotchev, Atanas, Cham: Springer, 2018, p. 143-176. Chapter in book (Refereed)
Place, publisher, year, edition, pages
Cham: Springer, 2018
Series
Signals and Communication Technology, ISSN 1860-4862
National Category
Electrical Engineering, Electronic Engineering, Information Engineering
Identifiers
urn:nbn:se:miun:diva-34382 (URN); 2-s2.0-85063129881 (Scopus ID); 978-3-319-77842-6 (ISBN)
Available from: 2018-09-13. Created: 2018-09-13. Last updated: 2019-05-22. Bibliographically approved
Li, Y., Sjöström, M., Olsson, R. & Jennehag, U. (2016). Coding of focused plenoptic contents by displacement intra prediction. IEEE transactions on circuits and systems for video technology (Print), 26(7), 1308-1319, Article ID 7137669.
Coding of focused plenoptic contents by displacement intra prediction
2016 (English) In: IEEE transactions on circuits and systems for video technology (Print), ISSN 1051-8215, E-ISSN 1558-2205, Vol. 26, no 7, p. 1308-1319, article id 7137669. Article in journal (Refereed), Published
Abstract [en]

A light field is commonly described by a two-plane representation with four dimensions. Refocused three-dimensional contents can be rendered from light field images, and one method for capturing these images is to use cameras with microlens arrays. A dense sampling of the light field results in large amounts of redundant data, so efficient compression is vital for practical use of these data. In this paper, we propose a displacement intra prediction scheme with a maximum of two hypotheses for the compression of plenoptic contents from focused plenoptic cameras. The proposed scheme is implemented in HEVC. The work aims at coding plenoptic captured contents efficiently without knowledge of the underlying camera geometry. In addition, a theoretical analysis of displacement intra prediction for plenoptic images is given, and the relationship between the compressed captured images and their rendered quality is analyzed. Evaluation results show that plenoptic contents can be compressed efficiently by the proposed scheme: bit rate reductions of up to 60 percent over HEVC are obtained for plenoptic images, and of more than 30 percent for the tested video sequences.
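The displacement intra prediction idea summarized above can be sketched as follows: a block is predicted from the already-coded (causal) area of the same image, optionally averaging two hypotheses. This is only a toy illustration with an exhaustive SAD search and a greedy second hypothesis; the function and parameter names are ours, not from the paper.

```python
import numpy as np

def displacement_intra_predict(img, y, x, bs=8, search=16, max_hyp=2):
    """Toy sketch: predict the bs x bs block at (y, x) from candidate
    blocks in the causal area above it (requires y >= bs), picking the
    best SAD match and optionally averaging in a second hypothesis."""
    target = img[y:y + bs, x:x + bs].astype(float)
    sad = lambda p: float(np.abs(target - p).sum())
    # candidate blocks lying fully above the target block row
    cands = [img[cy:cy + bs, cx:cx + bs].astype(float)
             for cy in range(max(0, y - search), y - bs + 1)
             for cx in range(max(0, x - search),
                             min(img.shape[1] - bs, x + search) + 1)]
    best = min(cands, key=sad)          # single best hypothesis
    pred = best
    if max_hyp == 2:
        # greedy second hypothesis: average the best candidate with
        # whichever other candidate lowers the SAD further
        for c in cands:
            avg = 0.5 * (best + c)
            if sad(avg) < sad(pred):
                pred = avg
    return pred, sad(pred)
```

On plenoptic content, the repetitive microlens pattern means a displaced copy of an earlier micro-image is often a near-exact match, which is what makes this kind of prediction effective.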

Keywords
Cameras, Distortion, Encoding, Image coding, Lenses, Microoptics, Rendering (computer graphics), HEVC, Plenoptic images, Plenoptic videos, compression, light field
National Category
Signal Processing; Telecommunications
Identifiers
urn:nbn:se:miun:diva-25230 (URN); 10.1109/TCSVT.2015.2450333 (DOI); 000384075100009 (); 2-s2.0-84978658360 (Scopus ID); STC (Local ID); STC (Archive number); STC (OAI)
Available from: 2015-06-22. Created: 2015-06-22. Last updated: 2017-12-04. Bibliographically approved
Li, Y., Olsson, R. & Sjöström, M. (2016). Compression of Unfocused Plenoptic Images using a Displacement Intra prediction. In: 2016 IEEE International Conference on Multimedia and Expo Workshop, ICMEW 2016: . Paper presented at 2016 IEEE International Conference on Multimedia and Expo Workshop, ICMEW 2016; Seattle; United States; 11 July 2016 through 15 July 2016. IEEE Signal Processing Society, Article ID 7574673.
Compression of Unfocused Plenoptic Images using a Displacement Intra prediction
2016 (English) In: 2016 IEEE International Conference on Multimedia and Expo Workshop, ICMEW 2016, IEEE Signal Processing Society, 2016, article id 7574673. Conference paper, Published paper (Refereed)
Abstract [en]

Plenoptic images are one type of light field content, produced by combining a conventional camera with an additional optical component in the form of a microlens array positioned in front of the image sensor surface. This camera setup captures a sub-sampling of the light field with high spatial fidelity over a small range and a more coarsely sampled angular range. The earliest applications that leverage plenoptic image content are image refocusing, non-linear distribution of out-of-focus areas, SNR vs. resolution trade-offs, and 3D-image creation, all provided by post-processing methods. In this work, we evaluate a compression method that we previously proposed for a different type of plenoptic image (focused, or plenoptic camera 2.0, content) than the unfocused, or plenoptic camera 1.0, content used in this Grand Challenge. The method extends the state-of-the-art video compression standard HEVC by bringing the capability of bi-directional inter-frame prediction into the spatial prediction. The method is evaluated according to the scheme set out by the Grand Challenge, and the results show a high compression efficiency compared with JPEG, i.e., up to 6 dB improvement for the tested images.

Place, publisher, year, edition, pages
IEEE Signal Processing Society, 2016
Keywords
Light field, plenoptic, HEVC, B-coder
National Category
Media and Communication Technology
Identifiers
urn:nbn:se:miun:diva-27567 (URN); 10.1109/ICMEW.2016.7574673 (DOI); 000386808400017 (); 2-s2.0-84992129718 (Scopus ID); STC (Local ID); 978-1-5090-1552-8 (ISBN); STC (Archive number); STC (OAI)
Conference
2016 IEEE International Conference on Multimedia and Expo Workshop, ICMEW 2016; Seattle; United States; 11 July 2016 through 15 July 2016
Available from: 2016-05-01. Created: 2016-05-01. Last updated: 2018-04-10. Bibliographically approved
Li, Y., Sjöström, M., Olsson, R. & Jennehag, U. (2016). Scalable coding of plenoptic images by using a sparse set and disparities. IEEE Transactions on Image Processing, 25(1), 80-91, Article ID 7321029.
Scalable coding of plenoptic images by using a sparse set and disparities
2016 (English) In: IEEE Transactions on Image Processing, ISSN 1057-7149, E-ISSN 1941-0042, Vol. 25, no 1, p. 80-91, article id 7321029. Article in journal (Refereed), Published
Abstract [en]

One of the light field capturing techniques is focused plenoptic capturing. By placing a microlens array in front of the photosensor, focused plenoptic cameras capture both spatial and angular information of a scene, in each microlens image and across microlens images. The capturing results in a significant amount of redundant information, and the captured image is usually of large resolution. A coding scheme that removes the redundancy before coding is therefore advantageous for efficient compression, transmission, and rendering. In this paper, we propose a lossy coding scheme to represent plenoptic images efficiently. The format contains a sparse image set and its associated disparities. Reconstruction is performed by disparity-based interpolation and inpainting, and the reconstructed image is then employed as a prediction reference for the coding of the full plenoptic image. As an outcome of the representation, the proposed scheme inherits a scalable structure with three layers. The results show that plenoptic images are compressed efficiently, with over 60 percent bit rate reduction compared to HEVC intra and over 20 percent compared to HEVC block copying mode.
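The disparity-based reconstruction step described in the abstract can be sketched roughly as follows, assuming a constant 1-D disparity, every second microlens column dropped, and a wrap-around shift in place of true interpolation and inpainting; the names are illustrative, not from the paper.

```python
import numpy as np

def reconstruct_from_sparse_set(plenoptic, ml, disparity):
    """Toy sketch: rebuild every second microlens column (width ml) from
    its kept left neighbour, shifted by a constant disparity. Only the
    kept (even-index) columns are read; disocclusion inpainting is
    omitted and the shift wraps around for simplicity."""
    rec = plenoptic.copy()
    w = plenoptic.shape[1]
    for cx in range(ml, w, 2 * ml):      # dropped microlens columns
        kept = plenoptic[:, cx - ml:cx]  # kept neighbour to the left
        rec[:, cx:cx + ml] = np.roll(kept, -disparity, axis=1)
    return rec
```

In the coding scheme the reconstruction is not the final output but a prediction reference, so any residual error left by the shift-and-inpaint step is absorbed by the subsequent residual coding.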

Keywords
image coding, image reconstruction, interpolation, disparity-based interpolation, image inpainting, lossy coding scheme, microlens image, photosensor, plenoptic images, scalable coding scheme, sparse image set, Cameras, Lenses, Microoptics, Rendering (computer graphics), Scalability, HEVC, Plenoptic, compression, light field
National Category
Signal Processing; Telecommunications
Identifiers
urn:nbn:se:miun:diva-26204 (URN); 10.1109/TIP.2015.2498406 (DOI); 000378330300006 (); 2-s2.0-85009468560 (Scopus ID); STC (Local ID); STC (Archive number); STC (OAI)
Available from: 2015-11-03. Created: 2015-11-03. Last updated: 2017-08-22. Bibliographically approved
Li, Y., Sjöström, M. & Olsson, R. (2015). Coding of plenoptic images by using a sparse set and disparities. In: Proceedings - IEEE International Conference on Multimedia and Expo: . Paper presented at IEEE International Conference on Multimedia and Expo, ICME 2015; Turin; Italy; 29 June 2015 through 3 July 2015. IEEE conference proceedings
Coding of plenoptic images by using a sparse set and disparities
2015 (English) In: Proceedings - IEEE International Conference on Multimedia and Expo, IEEE conference proceedings, 2015, Art. no. 7177510. Conference paper, Published paper (Refereed)
Abstract [en]

A focused plenoptic camera captures not only the spatial information of a scene but also the angular information. The capturing results in a plenoptic image of large resolution consisting of multiple microlens images, each similar to its neighbors. An efficient compression method that utilizes this pattern of similarity can therefore reduce the coding bit rate and further facilitate the usage of the images. In this paper, we propose an approach for coding focused plenoptic images by using a representation that consists of a sparse plenoptic image set and disparities. Based on this representation, a reconstruction method using interpolation and inpainting is devised to reconstruct the original plenoptic image. As a consequence, instead of coding the original image directly, we encode the sparse image set plus the disparity maps and use the reconstructed image as a prediction reference to encode the original image. The results show that the proposed scheme performs better than HEVC intra, with more than 5 dB PSNR gain or over 60 percent bit rate reduction.

Place, publisher, year, edition, pages
IEEE conference proceedings, 2015
Keywords
Plenoptic, lightfield, HEVC, compression
National Category
Signal Processing; Telecommunications
Identifiers
urn:nbn:se:miun:diva-25034 (URN); 10.1109/ICME.2015.7177510 (DOI); 000380486500133 (); 2-s2.0-84946061755 (Scopus ID); STC (Local ID); 978-1-4799-7082-7 (ISBN); STC (Archive number); STC (OAI)
Conference
IEEE International Conference on Multimedia and Expo, ICME 2015; Turin; Italy; 29 June 2015 through 3 July 2015
Available from: 2015-06-01. Created: 2015-06-01. Last updated: 2017-08-22. Bibliographically approved
Li, Y. (2015). Coding of Three-dimensional Video Content: Diffusion-based Coding of Depth Images and Displacement Intra-Coding of Plenoptic Contents. (Doctoral dissertation). Sundsvall: Mid Sweden University
Coding of Three-dimensional Video Content: Diffusion-based Coding of Depth Images and Displacement Intra-Coding of Plenoptic Contents
2015 (English) Doctoral thesis, monograph (Other academic)
Abstract [en]

In recent years, the three-dimensional (3D) movie industry has reaped massive commercial success in theaters. With the advancement of display technologies and more mature capturing and generation of 3D contents, TV broadcasting, movies, and games in 3D have entered home entertainment, and it is likely that 3D applications will play an important role in many aspects of people's lives in the not too distant future. 3D video contents contain at least two views from different perspectives, one for each of the viewer's eyes. The amount of coded information is doubled if these views are encoded separately. Moreover, for multi-view displays (in which different perspectives of a scene in 3D are presented to the viewer at the same time through different angles), either video streams of all the required views must be transmitted to the receiver, or the displays must synthesize the missing views from a subset of the views. The latter approach has been widely proposed to reduce the amount of data being transmitted and to make the data adjustable to 3D displays. The virtual views can be synthesized by the Depth Image Based Rendering (DIBR) approach from textures and associated depth images. However, it is still the case that the amount of information for the textures plus the depths presents a significant challenge for the network transmission capacity. Compression techniques are vital to facilitate the transmission. In addition to multi-view and multi-view plus depth for reproducing 3D, light field techniques have recently become a hot topic. Light field capturing aims at acquiring not only spatial but also angular information of a view, and an ideal light field rendering device should be such that viewers perceive it as looking through a window. The light field techniques are thus a step forward in providing a more authentic perception of 3D. Among the many light field capturing approaches, focused plenoptic capturing is a solution that utilizes microlens arrays.
Plenoptic cameras are portable and commercially available, and multi-view and refocusing can be obtained from their output during post-production. However, the captured plenoptic images are of large size and contain a significant amount of redundant information. Efficient compression of the above-mentioned contents will therefore increase the availability of content access and provide a better quality experience under the same network capacity constraints. In this thesis, the compression of depth images and of plenoptic contents captured by focused plenoptic cameras is addressed. Depth images can be assumed to be piece-wise smooth. Starting from the properties of depth images, a novel depth image model based on edges and sparse samples is presented, which may also be utilized for depth image post-processing. Based on this model, a depth image coding scheme that explicitly encodes the locations of depth edges is proposed, and the coding scheme has a scalable structure. Furthermore, a compression scheme for block-based 3D-HEVC is devised, in which diffusion is used for intra prediction. In addition to the proposed schemes, the thesis illustrates several evaluation methodologies, especially the subjective test of the stimulus-comparison method, which is suitable for evaluating the quality of two impaired images, as objective metrics are inaccurate with respect to synthesized views. For the compression of plenoptic contents, displacement intra prediction with more than one hypothesis is applied and implemented in HEVC for efficient prediction. In addition, a scalable coding approach utilizing a sparse set and disparities is introduced for the coding of focused plenoptic images. The MPEG test sequences were used for the evaluation of the proposed depth image compression, and publicly available plenoptic image and video contents were used for the assessment of the proposed plenoptic compression.
For depth image coding, the results showed that virtual views synthesized from depth images post-processed with the proposed model are better than those synthesized from the original depth images. More importantly, the proposed coding schemes using this model produced better synthesized views than state-of-the-art schemes. For plenoptic contents, the proposed scheme achieved efficient prediction and reduced the bit rate significantly while providing coding and rendering scalability. As a result, the outcome of the thesis can improve the quality of the 3DTV experience and facilitate the development of 3D applications in general.

Place, publisher, year, edition, pages
Sundsvall: Mid Sweden University, 2015. p. 145
Series
Mid Sweden University doctoral thesis, ISSN 1652-893X ; 222
National Category
Signal Processing
Identifiers
urn:nbn:se:miun:diva-25035 (URN); STC (Local ID); 978-91-88025-24-1 (ISBN); STC (Archive number); STC (OAI)
Public defence
2015-06-09, L111, Holmgatan 10, Sundsvall, 10:00 (English)
Opponent
Supervisors
Available from: 2015-06-02. Created: 2015-06-01. Last updated: 2017-08-22. Bibliographically approved
Li, Y., Sjöström, M., Olsson, R. & Jennehag, U. (2014). Efficient Intra Prediction Scheme For Light Field Image Compression. In: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings: . Paper presented at 2014 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2014; Florence; Italy; 4 May 2014 through 9 May 2014; Category number CFP14ICA-USB; Code 106632 (pp. Art. no. 6853654). IEEE conference proceedings
Efficient Intra Prediction Scheme For Light Field Image Compression
2014 (English) In: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings, IEEE conference proceedings, 2014, Art. no. 6853654. Conference paper, Published paper (Refereed)
Abstract [en]

Interactive photo-realistic graphics can be rendered by using light field datasets. One way of capturing such a dataset is with light field cameras containing microlens arrays. The captured images contain repetitive patterns resulting from adjacent microlenses and do not resemble natural scenes. This dissimilarity leads to problems when light field images are compressed with traditional image and video encoders, which are optimized for natural images and video sequences. In this paper, we introduce the full inter-prediction scheme of HEVC into intra-prediction for the compression of light field images. The proposed scheme is capable of performing both unidirectional and bi-directional prediction within an image. The evaluation results show that quality improvements above 3 dB, or bit-rate savings above 50 percent, can be achieved in terms of BD-PSNR for the proposed scheme compared to the original HEVC intra-prediction for light field images.

Place, publisher, year, edition, pages
IEEE conference proceedings, 2014
Keywords
Compression, HEVC, Light Field, Microlens array
National Category
Signal Processing
Identifiers
urn:nbn:se:miun:diva-21618 (URN); 10.1109/ICASSP.2014.6853654 (DOI); 000343655300109 (); 2-s2.0-84905255726 (Scopus ID); STC (Local ID); STC (Archive number); STC (OAI)
Conference
2014 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2014; Florence; Italy; 4 May 2014 through 9 May 2014; Category number CFP14ICA-USB; Code 106632
Available from: 2014-03-21. Created: 2014-03-21. Last updated: 2017-08-22. Bibliographically approved
Li, Y. (2013). Coding of three-dimensional video content: Depth image coding by diffusion. (Licentiate dissertation). Sundsvall: Mid Sweden University
Coding of three-dimensional video content: Depth image coding by diffusion
2013 (English) Licentiate thesis, comprehensive summary (Other academic)
Abstract [en]

Three-dimensional (3D) movies in theaters have become a massive commercial success during recent years, and it is likely that, with the advancement of display technologies and the production of 3D contents, TV broadcasting in 3D will play an important role in home entertainment in the not too distant future. 3D video contents contain at least two views from different perspectives, one for each of the viewer's eyes. The amount of coded information is doubled if these views are encoded separately. Moreover, for multi-view displays (in which different perspectives of a scene in 3D are presented to the viewer at the same time through different angles), either video streams of all the required views must be transmitted to the receiver, or the displays must synthesize the missing views from a subset of the views. The latter approach has been widely proposed to reduce the amount of data being transmitted. The virtual views can be synthesized by the Depth Image Based Rendering (DIBR) approach from textures and associated depth images. However, it is still the case that the amount of information for the textures plus the depths presents a significant challenge for the network transmission capacity. Efficient compression will, therefore, increase the availability of content access and provide better video quality under the same network capacity constraints.

In this thesis, the compression of depth images is addressed. Depth images can be assumed to be piece-wise smooth. Starting from the properties of depth images, a novel depth image model based on edges and sparse samples is presented, which may also be utilized for depth image post-processing. Based on this model, a depth image coding scheme that explicitly encodes the locations of depth edges is proposed, and the coding scheme has a scalable structure. Furthermore, a compression scheme for block-based 3D-HEVC is devised, in which diffusion is used for intra prediction. In addition to the proposed schemes, the thesis illustrates several evaluation methodologies, especially the subjective test of the stimulus-comparison method, which is suitable for evaluating the quality of two impaired images, as objective metrics are inaccurate with respect to synthesized views.

The MPEG test sequences were used for the evaluation. The results showed that virtual views synthesized from depth images post-processed with the proposed model are better than those synthesized from the original depth images. More importantly, the proposed coding schemes using this model produced better synthesized views than state-of-the-art schemes. As a result, the outcome of the thesis can lead to a better quality of 3DTV experience.

Place, publisher, year, edition, pages
Sundsvall: Mid Sweden University, 2013. p. 36
Series
Mid Sweden University licentiate thesis, ISSN 1652-8948
National Category
Engineering and Technology; Signal Processing
Identifiers
urn:nbn:se:miun:diva-19087 (URN); STC (Local ID); 978-91-87103-76-6 (ISBN); STC (Archive number); STC (OAI)
Presentation
(English)
Opponent
Supervisors
Available from: 2013-06-11. Created: 2013-06-06. Last updated: 2016-10-20. Bibliographically approved
Li, Y., Sjöström, M., Jennehag, U. & Olsson, R. (2013). Depth Image Post-processing Method by Diffusion. In: Proceedings of SPIE-The International Society for Optical Engineering: 3D Image Processing (3DIP) and Applications. Paper presented at 3D Image Processing (3DIP) and Applications 2013; 3-7 Feb 2013; Burlingame, Ca, USA; Conference 8650 (pp. Art. no. 865003). SPIE - International Society for Optical Engineering
Depth Image Post-processing Method by Diffusion
2013 (English) In: Proceedings of SPIE-The International Society for Optical Engineering: 3D Image Processing (3DIP) and Applications, SPIE - International Society for Optical Engineering, 2013, Art. no. 865003. Conference paper, Published paper (Refereed)
Abstract [en]

Multi-view three-dimensional television relies on view synthesis to reduce the number of views being transmitted. Arbitrary views can be synthesized from textures with corresponding depth images. The depth images obtained from stereo pairs or range cameras may contain erroneous values, which entail artifacts in a rendered view. Post-processing of the data may then be utilized to enhance the depth image with the aim of reaching a better quality of synthesized views. We propose a Partial Differential Equation (PDE)-based interpolation method for reconstructing the smooth areas of depth images while preserving significant edges. We model the depth image with adjustable thresholds for edge detection and a uniform sparse sampling factor, followed by second-order PDE interpolation. The objective results show that a depth image processed by the proposed method can achieve a better quality of synthesized views than the original depth image. Visual inspection confirmed the results.
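The second-order PDE interpolation amounts to solving the discrete Laplace equation with the sampled pixels held fixed (Dirichlet constraints). A minimal Jacobi-relaxation sketch, with the edge-preservation step omitted and names of our own choosing:

```python
import numpy as np

def laplace_interpolate(samples, known, iters=2000):
    """Jacobi relaxation of the discrete Laplace equation: unknown
    pixels converge to the average of their four neighbours, while
    pixels with known == True stay fixed. np.roll wraps at the image
    border, so the border pixels should be among the known samples."""
    u = np.where(known, samples, samples[known].mean())
    for _ in range(iters):
        avg = 0.25 * (np.roll(u, 1, 0) + np.roll(u, -1, 0) +
                      np.roll(u, 1, 1) + np.roll(u, -1, 1))
        u = np.where(known, samples, avg)
    return u
```

Because a linear depth ramp is harmonic, its interior can be reconstructed from border samples alone up to iteration error, which illustrates why piece-wise smooth depth images fit this edges-plus-sparse-samples model well.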

Place, publisher, year, edition, pages
SPIE - International Society for Optical Engineering, 2013
Keywords
Depth image, post-processing, view synthesis
National Category
Signal Processing
Identifiers
urn:nbn:se:miun:diva-18537 (URN); 10.1117/12.2003183 (DOI); 000322110500001 (); 2-s2.0-84878288330 (Scopus ID); STC (Local ID); 978-081949423-8 (ISBN); STC (Archive number); STC (OAI)
Conference
3D Image Processing (3DIP) and Applications 2013; 3-7 Feb 2013; Burlingame, Ca, USA; Conference 8650
Available from: 2013-02-27. Created: 2013-02-27. Last updated: 2017-08-22
Li, Y., Sjöström, M., Jennehag, U. & Olsson, R. (2013). Depth Map Compression with Diffusion Modes in 3D-HEVC. In: Philip Davies, David Newell (Ed.), MMEDIA 2013 - 5th International Conferences on Advances in Multimedia: . Paper presented at 5th International Conferences on Advances in Multimedia, MMEDIA 2013; Venice; Italy; 21 April 2013 through 26 April 2013; Code 106822 (pp. 125-129). International Academy, Research and Industry Association (IARIA)
Depth Map Compression with Diffusion Modes in 3D-HEVC
2013 (English) In: MMEDIA 2013 - 5th International Conferences on Advances in Multimedia / [ed] Philip Davies, David Newell, International Academy, Research and Industry Association (IARIA), 2013, p. 125-129. Conference paper, Published paper (Refereed)
Abstract [en]

For three-dimensional television, multiple views can be generated by using the Multi-view Video plus Depth (MVD) format. The depth maps of this format can be compressed efficiently by the 3D extension of High Efficiency Video Coding (3D-HEVC), which exploits the correlation between its two components, the texture and the associated depth map. In this paper, we introduce two diffusion-based modes for depth map coding into HEVC. The proposed modes use the framework for inter-component prediction of the Depth Modeling Modes (DMM): they detect edges from the texture and then diffuse an entire block from known adjacent blocks by solving the Laplace equation constrained by the detected edges. The experimental results show that depth maps can be compressed more efficiently with the proposed diffusion modes, where the bit rate saving can reach 1.25 percent of the total depth bit rate at a constant quality of synthesized views.
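The edge-constrained diffusion can be sketched as weighted neighbour averaging in which detected edge pixels carry zero weight, so seed values never diffuse across an edge. This is a toy sketch with wrap-around neighbours and illustrative names, not the 3D-HEVC implementation:

```python
import numpy as np

def diffuse_with_edges(block, seeds, edges, iters=2000):
    """Weighted-average diffusion: pixels flagged in `edges` carry
    zero weight and therefore never pass their value on, so seed
    values cannot diffuse across an edge. Neighbours wrap around the
    block border in this toy sketch."""
    u = np.where(seeds, block, 0.0)
    w = (~edges).astype(float)                     # 0 on edge pixels
    for _ in range(iters):
        num = sum(np.roll(u * w, s, a) for s in (1, -1) for a in (0, 1))
        den = sum(np.roll(w, s, a) for s in (1, -1) for a in (0, 1))
        u = np.where(seeds, block, num / np.maximum(den, 1e-9))
    return u
```

With a vertical edge splitting a block, each side settles to its own seed value instead of blurring across the depth discontinuity, which is the behaviour a depth map needs to keep synthesized views sharp.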

Place, publisher, year, edition, pages
International Academy, Research and Industry Association (IARIA), 2013
Keywords
Depth map coding, Diffusion modes, HEVC
National Category
Electrical Engineering, Electronic Engineering, Information Engineering
Identifiers
urn:nbn:se:miun:diva-18818 (URN); 2-s2.0-84905867855 (Scopus ID); 978-1-61208-265-3 (ISBN)
Conference
5th International Conferences on Advances in Multimedia, MMEDIA 2013; Venice; Italy; 21 April 2013 through 26 April 2013; Code 106822
Available from: 2013-04-25. Created: 2013-04-25. Last updated: 2017-08-22. Bibliographically approved