Improved ROI Video Coding using Variable Gaussian Pre-Filters and Variance in Intensity
Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Technology and Media. (Realistic3D, SensibleReality, MUCOM)
Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Technology and Media. (Realistic3D, SensibleReality, MUCOM) ORCID iD: 0000-0003-3751-6089
2005 (English). In: IEEE International Conference on Image Processing 2005, ICIP 2005: Vol. 2, 2005, 1817-1820 p., 1530054. Conference paper (Refereed)
Abstract [en]

In applications involving video over mobile phones or the Internet, the quality is limited by the transmission rate and can be improved by region-of-interest (ROI) coding. In this paper we present a preprocessing method that uses variable Gaussian filters controlled by a quality map indicating the distance to the ROI border. This reduces border effects and gives a small improvement in the PSNR of the intensity component within the ROI after compression, compared to using only one low-pass filter. With the compressed original sequence as a reference, the average PSNR increased by 1.25 dB and 2.3 dB at 100 kbit/s and 150 kbit/s, respectively. A modified quality map is introduced that uses variance to exclude pixels that are not visibly affected by the Gaussian filters, reducing computational complexity. Using less than 76% of the pixels gives no noticeable change in quality.
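The idea of a pre-filter steered by a quality map can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the chessboard distance, the linear mapping from distance to standard deviation (`sigma_max`, `ramp`), the 3x3 variance window and `var_threshold` are all assumptions made for the example, and the function names are hypothetical.

```python
import math
from collections import deque

def distance_to_roi(mask):
    """Chessboard distance from each pixel to the nearest ROI pixel (0 inside the ROI).
    Assumes the mask contains at least one ROI pixel."""
    h, w = len(mask), len(mask[0])
    dist = [[0 if mask[y][x] else None for x in range(w)] for y in range(h)]
    queue = deque((y, x) for y in range(h) for x in range(w) if mask[y][x])
    while queue:  # multi-source breadth-first search over 8-connected neighbours
        y, x = queue.popleft()
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                ny, nx = y + dy, x + dx
                if 0 <= ny < h and 0 <= nx < w and dist[ny][nx] is None:
                    dist[ny][nx] = dist[y][x] + 1
                    queue.append((ny, nx))
    return dist

def gaussian_at(img, y, x, sigma):
    """Gaussian-weighted mean around (y, x); image borders are handled by clamping."""
    if sigma <= 0:
        return img[y][x]
    h, w = len(img), len(img[0])
    r = max(1, math.ceil(3 * sigma))
    total = weight_sum = 0.0
    for dy in range(-r, r + 1):
        for dx in range(-r, r + 1):
            wgt = math.exp(-(dx * dx + dy * dy) / (2 * sigma * sigma))
            yy = min(max(y + dy, 0), h - 1)
            xx = min(max(x + dx, 0), w - 1)
            total += wgt * img[yy][xx]
            weight_sum += wgt
    return total / weight_sum

def local_variance(img, y, x):
    """Variance of the intensities in the 3x3 neighbourhood (clamped at borders)."""
    h, w = len(img), len(img[0])
    vals = [img[min(max(y + dy, 0), h - 1)][min(max(x + dx, 0), w - 1)]
            for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
    mean = sum(vals) / len(vals)
    return sum((v - mean) ** 2 for v in vals) / len(vals)

def roi_prefilter(img, mask, sigma_max=2.0, ramp=4, var_threshold=0.0):
    """Blur background pixels more strongly the farther they lie from the ROI.
    Returns the filtered image and the number of pixels actually filtered."""
    dist = distance_to_roi(mask)
    out = [row[:] for row in img]
    filtered = 0
    for y in range(len(img)):
        for x in range(len(img[0])):
            d = dist[y][x]
            if d == 0:
                continue  # ROI pixels are left untouched
            if local_variance(img, y, x) <= var_threshold:
                continue  # flat area: skipped to save computation (assumed criterion)
            sigma = sigma_max * min(d, ramp) / ramp  # assumed linear ramp, capped at sigma_max
            out[y][x] = gaussian_at(img, y, x, sigma)
            filtered += 1
    return out, filtered
```

Called as `roi_prefilter(img, mask)` with `img` a 2D list of intensities and `mask` a 2D list of booleans, the sketch leaves the ROI unchanged and reports how many background pixels were filtered, which is the quantity the variance-based exclusion aims to reduce.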

Place, publisher, year, edition, pages
2005. 1817-1820 p., 1530054
Series
Proceedings - International Conference on Image Processing, ICIP, ISSN 1522-4880
Keyword [en]
ROI, Region-Of-Interest, Variable filters
National Category
Computer Science
Identifiers
URN: urn:nbn:se:miun:diva-5808
DOI: 10.1109/ICIP.2005.1530054
ISI: 000235773302067
Scopus ID: 2-s2.0-33749584347
Local ID: 3290
ISBN: 0-7803-9134-9 (print)
OAI: oai:DiVA.org:miun-5808
DiVA: diva2:30841
Conference
IEEE International Conference on Image Processing (ICIP 2005), Sep 11-14, 2005, Genoa, Italy
Projects
STC - Sensible Things that Communicate
Available from: 2008-09-30. Created: 2009-07-30. Last updated: 2017-08-22. Bibliographically approved
In thesis
1. Spatio-Temporal Pre-Processing Methods for Region-of-Interest Video Coding
2007 (English). Licentiate thesis, monograph (Other academic)
Abstract [en]

In video transmission at low bit rates the challenge is to compress the video with a minimal reduction of the perceived quality. The compression can be adapted to knowledge of which regions in the video sequence are of most interest to the viewer. Region of interest (ROI) video coding uses this information to control the allocation of bits to the background and the ROI. The aim is to increase the quality in the ROI at the expense of the quality in the background. For this to work, the typical content of an ROI for a particular application is first determined, and the actual detection is performed based on this information. The allocation of bits can then be controlled based on the result of the detection.

In this licentiate thesis, existing methods to control bit allocation in ROI video coding are investigated, in particular pre-processing methods that are applied independently of the codec or standard. This makes it possible to apply the method directly to the video sequence without modifications to the codec. Three filters are proposed in this thesis based on previous approaches: a spatial filter that only modifies the background within a single frame, a temporal filter that uses information from the previous frame, and a spatio-temporal filter that combines the two. The abilities of these filters to reduce the number of bits necessary to encode the background and to successfully re-allocate these bits to the ROI are investigated. In addition, the computational complexities of the algorithms are analysed.
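The abstract does not say how the temporal filter uses the previous frame. One simple possibility, shown here purely as an assumed illustration, is a recursive blend of each background pixel with the previously filtered frame, which damps background changes between frames. The function name and the blending weight `alpha` are hypothetical.

```python
def temporal_background_filter(frames, masks, alpha=0.5):
    """Blend each background pixel with the previous filtered frame.

    frames: list of 2D lists of intensities, one per time step.
    masks:  list of 2D lists of booleans, True where the pixel belongs to the ROI.
    alpha:  weight of the current frame (assumed parameter; 1.0 disables the filter).
    """
    filtered = []
    prev = None
    for frame, mask in zip(frames, masks):
        if prev is None:
            out = [row[:] for row in frame]  # first frame has no predecessor
        else:
            out = [[frame[y][x] if mask[y][x]
                    else alpha * frame[y][x] + (1 - alpha) * prev[y][x]
                    for x in range(len(frame[0]))]
                   for y in range(len(frame))]
        filtered.append(out)
        prev = out
    return filtered
```

Because each output pixel needs only one multiply-add per frame, a temporal step of this kind is cheaper than a spatial Gaussian window, which is in line with the complexity argument made in the abstract.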

The theoretical analysis is verified by quantitative tests. These include measuring the quality using both the PSNR of the ROI and the border of the background, as well as subjective tests with human test subjects and an analysis of motion vector statistics.
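For reference, the PSNR restricted to a region can be computed as in the sketch below. It uses the standard definition, PSNR = 10 log10(peak^2 / MSE), evaluated only over pixels selected by a mask. The function name and the 8-bit peak value of 255 are assumptions for the example.

```python
import math

def psnr_in_region(reference, test, mask, peak=255.0):
    """PSNR in dB over the pixels where mask is True.

    Returns math.inf when the selected pixels are identical.
    """
    squared_error = 0.0
    count = 0
    for ref_row, test_row, mask_row in zip(reference, test, mask):
        for r, t, m in zip(ref_row, test_row, mask_row):
            if m:
                squared_error += (r - t) ** 2
                count += 1
    if count == 0:
        raise ValueError("mask selects no pixels")
    mse = squared_error / count
    if mse == 0:
        return math.inf
    return 10.0 * math.log10(peak * peak / mse)
```

The same function can measure the background border by passing a mask that selects the border pixels instead of the ROI.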

The quantitative analysis shows that the spatio-temporal filter has a better coding efficiency than the other filters and that it successfully re-allocates bits from the background to the ROI. The spatio-temporal filter gives an improvement in average PSNR in the ROI of more than 1.32 dB, or a reduction in bitrate of 31 %, compared to the encoding of the original sequence. This result is similar to or slightly better than that of the spatial filter. However, the spatio-temporal filter performs better overall, since its computational complexity is lower than that of the spatial filter.

Place, publisher, year, edition, pages
Sundsvall: Mid Sweden Univ, 2007. 112 p.
Series
Mid Sweden University licentiate thesis, ISSN 1652-8948 ; 21
Keyword
Region-of-interest, video coding, pre-processing, spatio-temporal filters
National Category
Information Science
Identifiers
urn:nbn:se:miun:diva-51 (URN); 5113 (Local ID); 978-91-85317-45-5 (ISBN); 5113 (Archive number); 5113 (OAI)
Presentation
2007-04-27, L111, L, Mittuniversitetet, Sundsvall, 13:00 (English)
Available from: 2007-12-20. Created: 2007-12-20. Last updated: 2010-06-11. Bibliographically approved
2. Coding for Improved Perceived Quality of 2D and 3D Video over Heterogeneous Networks
2010 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

The rapid development of video applications for TV, the internet and mobile phones is being taken one step further in 2010 with the introduction of stereo 3D TV. The 3D experience can be further improved using multiple views in the visualization. The transmission of 2D and 3D video at a sufficient perceived quality is a challenge considering the diversity in content, the resources of the network and the end-users. Two problems are addressed in this thesis: firstly, how to improve the perceived quality for an application with a limited bit rate; secondly, how to ensure the best perceived quality for all end-users in a heterogeneous network.

A solution to the first problem is region-of-interest (ROI) video coding, which adapts the coding to provide a better quality in regions of interest to the viewer. A spatio-temporal filter is proposed to provide codec- and standard-independent ROI video coding. The filter reduces the number of bits necessary to encode the background and successfully re-allocates these bits to the ROI. The temporal part of the filter reduces the complexity compared to only using a spatial filter. Adaptation to the requirements of the transmission channel is possible by controlling the standard deviation of the filter. The filter has also been successfully applied to 3D video in the form of 2D-plus-depth, where the depth data was used in the detection of the ROI.

The second problem can be solved by providing a video sequence that has the best overall quality, that is, the best quality for each part of the network and for each 2D and 3D visualization system over time. Scalable video coding enables the extraction of parts of the data to adapt to the requirements of the network and the end-user. A scheme is proposed in this thesis that provides scalability in the depth and view domain of multi-view plus depth video. The data are divided into enhancement layers depending on the content's distance to the camera. Schemes to divide the data into layers within a view and between adjacent views have been analyzed. The quality evaluation indicates that the position of the layers in depth as well as the number of layers should be determined by analyzing the depth distribution. The front-most layers in adjacent views should be given priority over the others unless the application requires a high quality of the center views.
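As a rough illustration of placing layers according to the depth distribution, the sketch below splits the depth values into layers holding approximately equal numbers of pixels. The quantile rule is an assumption chosen for the example, not the scheme evaluated in the thesis, and the function names are hypothetical. Depth values are taken to be distances to the camera, so layer 0 is the front-most layer.

```python
from bisect import bisect_right

def depth_layer_bounds(depth, n_layers):
    """Layer boundaries taken at equally spaced quantiles of the depth values."""
    values = sorted(v for row in depth for v in row)
    return [values[(i * len(values)) // n_layers] for i in range(1, n_layers)]

def assign_layers(depth, bounds):
    """Map every pixel to a layer index; values equal to a boundary go to the farther layer."""
    return [[bisect_right(bounds, v) for v in row] for row in depth]
```

For a depth map `depth`, `assign_layers(depth, depth_layer_bounds(depth, 3))` yields a per-pixel layer index that could then drive which pixels go into the base layer and which into enhancement layers.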

Abstract [sv, in English translation]

The rapid development of video applications for TV, the Internet and mobile phones takes a further step with the introduction of stereo 3D TV in 2010. The 3D experience can be enhanced further by using multiple views in the visualization. The differences in content, network resources and end-users make the transmission of 2D and 3D video at a sufficiently high perceived quality a challenge. This raises two problems: first, how to increase the perceived quality of an application with a limited transmission rate; second, how to provide the best perceived quality to all end-users in a heterogeneous network.

Region-of-interest (ROI) video coding is a solution to the first problem; it adapts the coding to give higher quality in regions that are of interest to the user. A spatio-temporal filter is proposed to provide codec- and standard-independent ROI video coding. The filter reduces the number of bits required to encode the background and re-allocates them to the ROI. The temporal part of the filter reduces the complexity compared to using only spatial filters. The filter can be adapted to the transmission rate by changing its standard deviation. The filter has also been applied to 3D video in the 2D-plus-depth format, where the depth data was used in the ROI detection.

The second problem can be solved by providing a video sequence that has the highest possible quality throughout the network, and thereby also the best quality for each part of the network and for each 2D and 3D display. Scalable video coding makes it possible to extract parts of the data to adapt to the prevailing conditions. A method that provides scalability in depth and between camera views for multi-view plus depth video has been proposed. The video sequence is divided into layers depending on the content's distance to the camera. Methods for distributing data over layers in depth and between adjacent views have been analyzed. The quality evaluation shows that the position of the layers in depth and the number of layers should be determined from the distribution of the depth data. The front-most layers in adjacent views should be given higher priority unless the application requires high quality in the center views.

Place, publisher, year, edition, pages
Sundsvall: Mittuniversitetet, 2010. 87 p.
Series
Mid Sweden University doctoral thesis, ISSN 1652-893X ; 87
National Category
Computer and Information Science
Identifiers
urn:nbn:se:miun:diva-11461 (URN); 978-91-86073-78-7 (ISBN)
Public defence
2010-05-27, O111, O-Huset, Holmgatan 10, Sundsvall, 10:00 (English)
Projects
Medi3D, 3D-reklam, MediaSense
Available from: 2010-04-29. Created: 2010-04-29. Last updated: 2017-08-22. Bibliographically approved

By author/editor
Karlsson, Linda; Sjöström, Mårten