Coding for Improved Perceived Quality of 2D and 3D Video over Heterogeneous Networks
Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Technology and Media. (Realistic3d, STC)
2010 (English)
Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

The rapid development of video applications for TV, the internet and mobile phones is being taken one step further in 2010 with the introduction of stereo 3D TV. The 3D experience can be further improved by using multiple views in the visualization. Transmitting 2D and 3D video at a sufficient perceived quality is a challenge considering the diversity in content, network resources and end-users. Two problems are addressed in this thesis: first, how to improve the perceived quality for an application with a limited bit rate; second, how to ensure the best perceived quality for all end-users in a heterogeneous network.

A solution to the first problem is region-of-interest (ROI) video coding, which adapts the coding to provide a better quality in regions of interest to the viewer. A spatio-temporal filter is proposed to provide codec- and standard-independent ROI video coding. The filter reduces the number of bits necessary to encode the background and successfully re-allocates these bits to the ROI. The temporal part of the filter reduces the complexity compared to using only a spatial filter. Adaptation to the requirements of the transmission channel is possible by controlling the standard deviation of the filter. The filter has also been successfully applied to 3D video in the form of 2D-plus-depth, where the depth data was used in the detection of the ROI.

The second problem can be solved by providing a video sequence that has the best overall quality, that is, the best quality for each part of the network and for each 2D and 3D visualization system over time. Scalable video coding enables the extraction of parts of the data to adapt to the requirements of the network and the end-user. A scheme is proposed in this thesis that provides scalability in the depth and view domains of multi-view plus depth video. The data are divided into enhancement layers depending on the content's distance to the camera. Schemes to divide the data into layers within a view and between adjacent views have been analyzed. The quality evaluation indicates that the position of the layers in depth as well as the number of layers should be determined by analyzing the depth distribution. The front-most layers in adjacent views should be given priority over the others unless the application requires a high quality of the center views.

Abstract [sv] (English translation)

The rapid development of video applications for TV, the Internet and mobile phones takes a further step with the introduction of stereo 3D TV in 2010. The 3D experience can be enhanced further by using multiple views in the visualization. Differences in content, network resources and end-users make the transmission of 2D and 3D video at a sufficiently high perceived quality a challenge. The first problem is how to increase the perceived quality of an application with a limited transmission rate. The second is how to provide the best perceived quality to all end-users in a heterogeneous network.

Region-of-interest (ROI) video coding is a solution to the first problem; it adapts the coding to give higher quality in regions that are of interest to the user. A spatio-temporal filter is proposed to provide codec- and standard-independent ROI video coding. The filter reduces the number of bits required to encode the background and re-allocates them to the ROI. The temporal part of the filter reduces the complexity compared to using only spatial filters. The filter can be adapted to the transmission rate by changing its standard deviation. The filter has also been applied to 3D video in the 2D-plus-depth format, where the depth data was used in the ROI detection.

The second problem can be solved by providing a video sequence that has the highest possible quality throughout the network, and thereby also the best quality for each part of the network and for each 2D and 3D display. Scalable video coding makes it possible to extract parts of the data to adapt to the prevailing conditions. A method that provides scalability in depth and between camera views of multi-view plus depth video has been proposed. The video sequence is divided into layers depending on the content's distance to the camera. Methods for distributing data over layers in depth and between adjacent views have been analyzed. The quality evaluation shows that the position of the layers in depth and the number of layers should be determined from the distribution of the depth data. The front-most layers in adjacent views should be given higher priority unless the application requires high quality in the center views.

Place, publisher, year, edition, pages
Sundsvall: Mittuniversitetet, 2010, p. 87
Series
Mid Sweden University doctoral thesis, ISSN 1652-893X ; 87
National Category
Computer and Information Sciences
Identifiers
URN: urn:nbn:se:miun:diva-11461
ISBN: 978-91-86073-78-7 (print)
OAI: oai:DiVA.org:miun-11461
DiVA, id: diva2:315447
Public defence
2010-05-27, O111, O-Huset, Holmgatan 10, Sundsvall, 10:00 (English)
Projects
Medi3D, 3D-reklam, MediaSense
Available from: 2010-04-29 Created: 2010-04-29 Last updated: 2018-01-12
Bibliographically approved
List of papers
1. Improved ROI Video Coding using Variable Gaussian Pre-Filters and Variance in Intensity
2005 (English). In: IEEE International Conference on Image Processing 2005, ICIP 2005: Vol. 2, 2005, p. 1817-1820, article id 1530054
Conference paper, Published paper (Refereed)
Abstract [en]

In applications involving video over mobile phones or the Internet, the quality, which is limited by the transmission rate, can be improved by region-of-interest (ROI) coding. In this paper we present a preprocessing method using variable Gaussian filters controlled by a quality map indicating the distance to the ROI border. The border effects are reduced, giving a small improvement in the PSNR of the intensity component within the ROI after compression compared to using only one low-pass filter. With the compressed original sequence as a reference, the average PSNR was increased by 1.25 dB and 2.3 dB for 100 kbit/s and 150 kbit/s, respectively. A modified quality map is introduced that uses variance to exclude pixels that are not visibly affected by the Gaussian filters, reducing computational complexity. Using less than 76% of the pixels gives no noticeable change in quality.
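The idea of a Gaussian pre-filter whose strength follows the distance to the ROI border can be illustrated with a minimal sketch on a single row of pixels. This is not the paper's implementation: the function names, the linear mapping from distance to standard deviation (`sigma_step`, `sigma_max`) and the clamped border handling are all assumptions made for illustration.

```python
import math

def gaussian_kernel(sigma, radius):
    """Return a normalised 1D Gaussian kernel of length 2*radius + 1."""
    weights = [math.exp(-(k * k) / (2.0 * sigma * sigma))
               for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def variable_blur_row(row, roi_start, roi_end, sigma_step=0.5, sigma_max=3.0):
    """Blur one pixel row; sigma grows with distance to the ROI [roi_start, roi_end)."""
    n = len(row)
    out = []
    for x, value in enumerate(row):
        if roi_start <= x < roi_end:
            out.append(float(value))  # ROI pixels are left untouched
            continue
        # Hypothetical quality-map entry: distance in pixels to the nearest ROI pixel.
        distance = roi_start - x if x < roi_start else x - (roi_end - 1)
        sigma = min(sigma_max, sigma_step * distance)  # assumed linear mapping
        radius = max(1, int(math.ceil(3 * sigma)))
        kernel = gaussian_kernel(sigma, radius)
        acc = 0.0
        for k, w in zip(range(-radius, radius + 1), kernel):
            xi = min(max(x + k, 0), n - 1)  # clamp indices at the row ends
            acc += w * row[xi]
        out.append(acc)
    return out

row = [0, 255] * 16  # high-frequency test pattern, 32 pixels
filtered = variable_blur_row(row, roi_start=14, roi_end=18)
```

In this sketch the background is smoothed more strongly the further it lies from the ROI, so the transition at the ROI border is gradual rather than abrupt, which is the effect the abstract attributes to using variable filters instead of a single low-pass filter.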

Series
Proceedings - International Conference on Image Processing, ICIP, ISSN 1522-4880
Keyword
ROI, Region-Of-Interest, Variable filters
National Category
Computer Sciences
Identifiers
urn:nbn:se:miun:diva-5808 (URN)
10.1109/ICIP.2005.1530054 (DOI)
000235773302067 ()
2-s2.0-33749584347 (Scopus ID)
3290 (Local ID)
0-7803-9134-9 (ISBN)
3290 (Archive number)
3290 (OAI)
Conference
IEEE International Conference on Image Processing (ICIP 2005), Sep 11-14, 2005, Genoa, Italy
Projects
STC - Sensible Things that Communicate
Available from: 2008-09-30 Created: 2009-07-30 Last updated: 2018-01-12
Bibliographically approved
2. Spatio-Temporal Filter for ROI Video Coding
2006 (English). In: Proceedings of the 14th European Signal Processing Conference (EUSIPCO 2006), Florence, Italy, 4-8 Sept. 2006, 2006
Conference paper, Published paper (Other academic)
Abstract [en]

Reallocating resources within a video sequence to the regions of interest increases the perceived quality at limited bandwidths. In this paper we combine a spatial filter with a temporal filter, both of which are codec- and standard-independent. This spatio-temporal filter removes resources from both the motion vectors and the prediction error, with a computational complexity lower than that of the spatial filter by itself. This decreases the bit rate by 30-50% compared to coding the original sequence using H.264. The released bits can be used by the codec to increase the PSNR of the ROI by 1.58-4.61 dB, which is larger than for the spatial and temporal filters by themselves.
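A rough sketch of the temporal part of such a pre-filter is given below. It assumes a plain sliding-window average over background pixels, which is a stand-in chosen for illustration and not necessarily the filter used in the paper; frames are flat lists of intensities and `roi_mask`, `window` and both function names are hypothetical.

```python
def temporal_filter(frames, roi_mask, window=3):
    """Average each background pixel over up to `window` recent frames; keep ROI pixels."""
    filtered = []
    for t, frame in enumerate(frames):
        start = max(0, t - window + 1)
        out = []
        for i, value in enumerate(frame):
            if roi_mask[i]:
                out.append(float(value))  # ROI is passed through unchanged
            else:
                history = [frames[s][i] for s in range(start, t + 1)]
                out.append(sum(history) / len(history))
        filtered.append(out)
    return filtered

def background_change(frames, roi_mask):
    """Sum of absolute frame-to-frame differences over background pixels."""
    total = 0.0
    for prev, cur in zip(frames, frames[1:]):
        total += sum(abs(c - p) for c, p, m in zip(cur, prev, roi_mask) if not m)
    return total

roi_mask = [False, True, True, False]
frames = [
    [100, 50, 60, 100],
    [110, 55, 65, 90],
    [95, 60, 70, 105],
    [105, 65, 75, 95],
]
smoothed = temporal_filter(frames, roi_mask)
before = background_change(frames, roi_mask)
after = background_change(smoothed, roi_mask)
```

Here `background_change` is only a crude proxy for the residual a predictive encoder would have to code in the background. A smaller value after filtering suggests fewer bits spent there, which an encoder running at a fixed bit rate could then spend on the ROI.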

National Category
Computer Sciences
Identifiers
urn:nbn:se:miun:diva-3779 (URN)
2-s2.0-84862634356 (Scopus ID)
4072 (Local ID)
4072 (Archive number)
4072 (OAI)
Conference
14th European Signal Processing Conference, EUSIPCO 2006; Florence; Italy; 4 September 2006 through 8 September 2006; Code 90688
Projects
STC - Sensible Things that Communicate
Available from: 2008-09-30 Created: 2009-07-29 Last updated: 2018-01-12
Bibliographically approved
3. A Spatio-Temporal Filter for Region-of-Interest Video Coding
(English)
Manuscript (preprint) (Other academic)
Abstract [en]

Region-of-interest (ROI) video coding increases the quality in regions interesting to the viewer at the expense of quality in the background. This enables a high perceived quality at low bit rates. A successfully detected ROI can be used to control the bit allocation in the encoding. In this paper we present a filter that is independent of codec and standard. It is applied in both the spatial and the temporal domains. The filter's ability to reduce the number of bits necessary to encode the background is analyzed theoretically, as is where these bits are re-allocated. The computational complexity of the algorithms is also determined. The quality is evaluated using the PSNR of the ROI and subjective tests. Tests showed that the spatio-temporal filter has a better coding efficiency than using only spatial or only temporal filtering. The filter successfully re-allocates the bits from the background to the foreground.
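The PSNR of the ROI mentioned above follows the standard definition, PSNR = 10 log10(peak^2 / MSE), with the mean squared error taken over ROI pixels only. A minimal sketch is shown below, assuming 8-bit intensities (peak 255) and a boolean mask; the function name and the flat-list image layout are illustrative choices, not taken from the paper.

```python
import math

def roi_psnr(reference, decoded, roi_mask, peak=255.0):
    """PSNR in dB computed only over the pixels where roi_mask is True."""
    squared_errors = [(r - d) ** 2
                      for r, d, m in zip(reference, decoded, roi_mask) if m]
    if not squared_errors:
        raise ValueError("ROI mask selects no pixels")
    mse = sum(squared_errors) / len(squared_errors)
    if mse == 0:
        return math.inf  # identical ROI content
    return 10.0 * math.log10(peak * peak / mse)

reference = [10, 20, 30, 40]
decoded = [10, 25, 25, 0]       # large error at the last pixel, outside the ROI
roi_mask = [False, True, True, False]
psnr_db = roi_psnr(reference, decoded, roi_mask)
```

Because the background error at the last pixel is excluded, the measure reflects only the region the viewer is assumed to attend to.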

National Category
Computer Engineering; Signal Processing
Identifiers
urn:nbn:se:miun:diva-11434 (URN)
Projects
Medi3D, Digital 3D-reklam, MediaSense
Available from: 2010-04-21 Created: 2010-04-21 Last updated: 2018-01-12
Bibliographically approved
4. Region-of-interest 3D video coding based on depth images
2008 (English). In: 2008 3DTV Conference - True Vision - Capture, Transmission and Display of 3D Video, IEEE conference proceedings, 2008, p. 121-124
Conference paper, Published paper (Refereed)
Abstract [en]

Three-dimensional (3D) TV is becoming a mature technology due to progress in areas such as display and network technology. However, 3D video demands a higher bandwidth in order to transmit the information needed to render or directly display several different views at the receiver. The 2D-plus-depth representation requires a lower bit rate than most 3D video representations, although the necessary views have to be rendered at the receiver. In this paper we propose to combine the 2D-plus-depth representation with region-of-interest (ROI) video coding to ensure a higher quality in parts of the sequence that are of interest to the viewer. These include objects close to the viewer as well as faces. This allows either the bit rate to be reduced by 12-28% or the quality within the ROI to be increased by 0.57-1.5 dB when a fixed bit rate is applied.
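Selecting objects close to the viewer from a depth image can be sketched as a simple threshold. The sketch assumes the common 8-bit convention in which larger depth values mean closer to the camera, and a fixed threshold of 128; both are assumptions for illustration, and the paper's detector (which also considers faces) is not reproduced here.

```python
def depth_roi_mask(depth_map, threshold=128):
    """Return a boolean mask that is True where the depth value is at or above threshold."""
    # Assumption: larger depth value = closer to the camera.
    return [[value >= threshold for value in row] for row in depth_map]

def roi_fraction(mask):
    """Fraction of pixels marked as ROI."""
    flat = [m for row in mask for m in row]
    return sum(flat) / len(flat)

depth_map = [
    [20, 30, 40, 30],
    [25, 200, 210, 35],
    [30, 190, 220, 40],
]
mask = depth_roi_mask(depth_map)
share = roi_fraction(mask)
```

The resulting mask could then drive an ROI pre-filter or the encoder's bit allocation for the 2D part of the sequence.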

Place, publisher, year, edition, pages
IEEE conference proceedings, 2008
Keyword
Region-of-interest, ROI, 3D video, 2D plus depth
National Category
Computer Sciences
Identifiers
urn:nbn:se:miun:diva-4249 (URN)
10.1109/3DTV.2008.4547828 (DOI)
000258372100031 ()
2-s2.0-50949115457 (Scopus ID)
5478 (Local ID)
978-1-4244-1760-5 (ISBN)
5478 (Archive number)
5478 (OAI)
Conference
3DTV Conference - True Vision - Capture, Transmission and Display of 3D Video, May 28-30, 2008, Istanbul, Turkey
Available from: 2010-06-14 Created: 2008-11-19 Last updated: 2018-01-12
Bibliographically approved
5. Multiview plus depth scalable coding in the depth domain
2009 (English). In: 3DTV-CON 2009 - 3rd 3DTV-Conference: The True Vision - Capture, Transmission and Display of 3D Video, Proceedings, IEEE conference proceedings, 2009, p. 5069631-
Conference paper, Published paper (Refereed)
Abstract [en]

Three-dimensional (3D) TV is a growing area that provides an extra dimension at the cost of spatial resolution. The multi-view plus depth representation provides a lower bit rate when encoded than multi-view, and a higher resolution than a 2D-plus-depth sequence. Scalable video coding provides adaptation to the conditions at the receiver. In this paper we propose a scheme that combines scalability in both the view and depth domains. The center view data is preserved, whereas the data of the side views are extracted in layers depending on distance to the camera. If one layer is extracted, this allows a decrease in bit rate of 16-39% for the colour part of a 3-view MV, depending on the number of pixels in the first enhancement layer. Each additional layer increases the visual quality and PSNR compared to only using center view data.

Place, publisher, year, edition, pages
IEEE conference proceedings, 2009
Keyword
Scalability, 3D video, multiview plus depth
National Category
Computer Sciences
Identifiers
urn:nbn:se:miun:diva-8962 (URN)
10.1109/3DTV.2009.5069631 (DOI)
000270053600019 ()
2-s2.0-70349916088 (Scopus ID)
978-1-4244-4318-5 (ISBN)
Conference
3rd 3DTV-Conference: The True Vision - Capture, Transmission and Display of 3D Video, 3DTV-CON 2009; Potsdam; 4 May 2009 through 6 May 2009
Projects
3D Reklam, MediaSense
Available from: 2010-06-14 Created: 2009-05-21 Last updated: 2018-01-13
Bibliographically approved
6. Performance of scalable coding in depth domain
2010 (English). In: Proceedings of the SPIE, Vol 7524: Conference on Stereoscopic Displays and Applications XXI, San Jose, CA, JAN 18-20, 2010. (Proceedings of SPIE-IS&T Electronic Imaging) / [ed] Andrew J. Woods, Nicolas S. Holliman, Neil A. Dodgson, SPIE - International Society for Optical Engineering, 2010, p. 75240A-
Conference paper, Published paper (Refereed)
Abstract [en]

Common autostereoscopic 3D displays are based on multi-view projection. The diversity of resolutions and number of views of such displays implies a necessary flexibility of 3D content formats in order to make broadcasting efficient. Furthermore, distribution of content over a heterogeneous network should adapt to the available network capacity. Present scalable video coding provides the ability to adapt to network conditions; it allows for quality, temporal and spatial scaling of 2D video. Scalability for 3D data extends this list to the depth and the view domains. We have introduced scalability with respect to depth information. It allows for an increased number of quality steps; the cost is a slight increase in the required capacity for the whole sequence. Our proposed scheme is based on the multi-view-plus-depth format, where the center view data are preserved and side views are extracted in layers depending on depth values. We investigate the performance of various layer assignment strategies: the number of layers, and the distribution of layers in depth, based either on an equal number of pixels or on histogram characteristics. We further consider the consequences of variable distortion due to encoder parameters. The results are evaluated considering their bit rate versus distortion as well as their visual quality.
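Of the layer assignment strategies named above, the one based on an equal number of pixels per layer can be sketched as a rank-based split of the depth values. The sketch assumes that larger depth values are closer to the camera and that layer 0 is the front-most layer; the function name and these conventions are illustrative, and the histogram-based strategy is not shown.

```python
def equal_count_layers(depths, num_layers):
    """Assign each pixel a layer index (0 = front-most) with roughly equal pixels per layer."""
    if num_layers < 1:
        raise ValueError("num_layers must be at least 1")
    # Assumption: larger depth value = closer to the camera, so sort descending.
    order = sorted(range(len(depths)), key=lambda i: depths[i], reverse=True)
    layers = [0] * len(depths)
    for rank, i in enumerate(order):
        layers[i] = rank * num_layers // len(depths)
    return layers

depths = [200, 10, 150, 90, 30, 220]  # flattened depth values of one side view
layers = equal_count_layers(depths, num_layers=3)
```

One consequence of a purely rank-based split is that pixels with identical depth values may end up in different layers; a practical scheme would more likely place the layer boundaries at depth thresholds derived from the ranks.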

Place, publisher, year, edition, pages
SPIE - International Society for Optical Engineering, 2010
Series
Proceedings of SPIE - The International Society for Optical Engineering, ISSN 0277-786X
Keyword
3D video; Compression; Depth values; Multi-view; Scalable coding
National Category
Electrical Engineering, Electronic Engineering, Information Engineering
Identifiers
urn:nbn:se:miun:diva-9997 (URN)
10.1117/12.838999 (DOI)
000284868500009 ()
2-s2.0-77949587834 (Scopus ID)
978-081947917-4 (ISBN)
Projects
Medi3D, Digital 3D-reklam
Available from: 2010-03-17 Created: 2009-10-08 Last updated: 2017-08-22
Bibliographically approved
7. Layer assignment based on depth data distribution for multiview-plus-depth scalable video coding
2011 (English). In: IEEE transactions on circuits and systems for video technology (Print), ISSN 1051-8215, E-ISSN 1558-2205, Vol. 21, no 6, p. 742-754
Article in journal (Refereed) Published
Abstract [en]

Three-dimensional (3D) video is experiencing rapid growth in a number of areas including 3D cinema, 3DTV and mobile phones. Several problems must be addressed to display captured 3D video at another location. One problem is how to represent the data. The multiview plus depth representation of a scene requires a lower bit rate than transmitting all views required by an application and provides more information than a 2D-plus-depth sequence. Another problem is how to handle transmission in a heterogeneous network. Scalable video coding enables adaptation of a 3D video sequence to the conditions at the receiver. In this paper we present a scheme that combines scalability based on the position in depth of the data and the distance to the center view. The general scheme preserves the center view data, whereas the data of the remaining views are extracted in enhancement layers depending on distance to the viewer and the center camera. The data are assigned to enhancement layers within a view based on the depth data distribution. Strategies concerning the layer assignment between adjacent views are proposed. In general, each extracted enhancement layer increases the visual quality and PSNR compared to only using center view data. The bit rate per layer can be further decreased if depth data is distributed over the enhancement layers. The choice of strategy to assign layers between adjacent views depends on whether the quality of the front-most objects in the scene or the quality of the views close to the center is more important.

Keyword
H.264/AVC; multiview; scalable video coding
National Category
Computer Engineering; Signal Processing
Identifiers
urn:nbn:se:miun:diva-11435 (URN)
10.1109/TCSVT.2011.2130350 (DOI)
000291356500005 ()
2-s2.0-79957982475 (Scopus ID)
STC (Local ID)
STC (Archive number)
STC (OAI)
Projects
Medi3D, Digital 3D-reklam
Available from: 2010-04-21 Created: 2010-04-21 Last updated: 2018-01-12
Bibliographically approved

Open Access in DiVA

fulltext (1186 kB), 920 downloads
File information
File name: FULLTEXT01.pdf
File size: 1186 kB
Checksum (SHA-512): bb9e91ddda510cf46cc93e8855e6542fdd9686332454c8f0ccb79683508db5e2faec2c676ee8005094deb05e79c73857aedb7d20ee4bbe4c9d55eeb98d9a6ffd
Type: fulltext
Mimetype: application/pdf

Authority records BETA

Karlsson, Linda Sofia
