miun.se Publications
1-50 of 71
  • 1.
    Ahmad, Waqas
    et al.
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Institutionen för informationssystem och –teknologi.
    Ghafoor, Mubeen
    COMSATS University Islamabad, Pakistan.
    Tariq, Syed Ali
    COMSATS University Islamabad, Pakistan.
    Hassan, Ali
    COMSATS University Islamabad, Pakistan.
    Sjöström, Mårten
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Institutionen för informationssystem och –teknologi.
    Olsson, Roger
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Institutionen för informationssystem och –teknologi.
    Computationally Efficient Light Field Image Compression Using a Multiview HEVC Framework (2019). In: IEEE Access, E-ISSN 2169-3536, Vol. 7, pp. 143002-143014. Journal article (Refereed)
    Abstract [en]

    The acquisition of the spatial and angular information of a scene using light field (LF) technologies supports a wide range of post-processing applications, such as scene reconstruction, refocusing, and virtual view synthesis. The additional angular information possessed by LF data increases the size of the overall data captured while offering the same spatial resolution. The main contributor to the size of the captured data (i.e., the angular information) contains a high correlation that is exploited by state-of-the-art video encoders by treating the LF as a pseudo video sequence (PVS). The interpretation of the LF as a single PVS restricts the encoding scheme to utilizing only the single-dimensional angular correlation present in the LF data. In this paper, we present an LF compression framework that efficiently exploits the spatial and angular correlation using a multiview extension of high-efficiency video coding (MV-HEVC). The input LF views are converted into multiple PVSs and are organized hierarchically. The rate-allocation scheme takes into account the assigned organization of frames and distributes quality/bits among them accordingly. Subsequently, the reference picture selection scheme prioritizes the reference frames based on the assigned quality. The proposed compression scheme is evaluated following the common test conditions set by JPEG Pleno. The proposed scheme performs 0.75 dB better than state-of-the-art compression schemes and 2.5 dB better than the x265-based JPEG Pleno anchor scheme. Moreover, an optimized motion-search scheme is proposed in the framework that reduces the computational complexity (in terms of the sum of absolute difference [SAD] computations) of motion estimation by up to 87% with a negligible loss in visual quality (approximately 0.05 dB).
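    The SAD cost mentioned at the end of the abstract is the standard block-matching measure used in motion estimation. As an illustration only (a minimal exhaustive search, not the paper's optimized motion-search scheme), it might be sketched as:

```python
import numpy as np

def sad(block_a: np.ndarray, block_b: np.ndarray) -> int:
    """Sum of absolute differences (SAD) between two equally sized blocks."""
    return int(np.abs(block_a.astype(np.int64) - block_b.astype(np.int64)).sum())

def full_search(ref: np.ndarray, block: np.ndarray, top: int, left: int, radius: int):
    """Exhaustive block-matching motion search over a +/- radius window.

    Returns the best (dy, dx) displacement and its SAD cost. Real encoders
    use much faster search patterns; this is only the brute-force baseline.
    """
    h, w = block.shape
    best = (0, 0, sad(ref[top:top + h, left:left + w], block))
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            y, x = top + dy, left + dx
            if 0 <= y and 0 <= x and y + h <= ref.shape[0] and x + w <= ref.shape[1]:
                cost = sad(ref[y:y + h, x:x + w], block)
                if cost < best[2]:
                    best = (dy, dx, cost)
    return best
```

The 87% complexity reduction reported above comes from pruning such SAD evaluations, not from changing the cost itself.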

  • 2.
    Ahmad, Waqas
    et al.
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informationssystem och -teknologi.
    Olsson, Roger
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informationssystem och -teknologi.
    Sjöström, Mårten
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informationssystem och -teknologi.
    Interpreting Plenoptic Images as Multi-View Sequences for Improved Compression (2017). Dataset
    Abstract [en]

    This dataset was produced in response to the ICIP 2017 Grand Challenge on plenoptic image compression. The input image format and compression rates set out by the competition are followed to estimate the results.

  • 3.
    Ahmad, Waqas
    et al.
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informationssystem och -teknologi.
    Olsson, Roger
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informationssystem och -teknologi.
    Sjöström, Mårten
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informationssystem och -teknologi.
    Interpreting Plenoptic Images as Multi-View Sequences for Improved Compression (2017). In: ICIP 2017, IEEE, 2017, pp. 4557-4561. Conference paper (Refereed)
    Abstract [en]

    Over the last decade, advancements in optical devices have made it possible for novel image acquisition technologies to appear. Angular information for each spatial point is acquired in addition to the spatial information of the scene, which enables 3D scene reconstruction and various post-processing effects. The current generation of plenoptic cameras spatially multiplexes the angular information, which implies an increase in image resolution to retain the level of spatial information gathered by conventional cameras. In this work, the resulting plenoptic image is interpreted as a multi-view sequence that is efficiently compressed using the multi-view extension of high efficiency video coding (MV-HEVC). A novel two-dimensional weighted prediction and rate allocation scheme is proposed to adapt the HEVC compression structure to the plenoptic image properties. The proposed coding approach is a response to the ICIP 2017 Grand Challenge: Light Field Image Coding. The proposed scheme outperforms all ICME contestants, and improves on the JPEG anchor of ICME with an average PSNR gain of 7.5 dB and on the HEVC anchor of the ICIP 2017 Grand Challenge with an average PSNR gain of 2.4 dB.
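    The gains reported above are average PSNR differences in dB. As a reminder of the metric (the standard definition, not code from the paper), PSNR for 8-bit images can be computed as:

```python
import numpy as np

def psnr(ref: np.ndarray, test: np.ndarray, peak: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB between a reference and a test image."""
    mse = np.mean((ref.astype(np.float64) - test.astype(np.float64)) ** 2)
    if mse == 0.0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(peak ** 2 / mse))
```

A gain of 7.5 dB therefore means the proposed codec's decoded views sit 7.5 dB higher on this scale, averaged over the test set.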

  • 4.
    Ahmad, Waqas
    et al.
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informationssystem och -teknologi.
    Olsson, Roger
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informationssystem och -teknologi.
    Sjöström, Mårten
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informationssystem och -teknologi.
    Towards a generic compression solution for densely and sparsely sampled light field data (2018). In: Proceedings of the 25th IEEE International Conference on Image Processing, 2018, pp. 654-658, article id 8451051. Conference paper (Refereed)
    Abstract [en]

    Light field (LF) acquisition technologies capture the spatial and angular information present in scenes. The angular information paves the way for various post-processing applications such as scene reconstruction, refocusing, and synthetic aperture. The light field is usually captured by a single plenoptic camera or by multiple traditional cameras. The former captures a dense LF, while the latter captures a sparse LF. This paper presents a generic compression scheme that efficiently compresses both densely and sparsely sampled LFs. A plenoptic image is converted into sub-aperture images, and each sub-aperture image is interpreted as a frame of a multi-view sequence. Likewise, each view of the multi-camera system is treated as a frame of a multi-view sequence. The multi-view extension of high efficiency video coding (MV-HEVC) is used to encode the pseudo multi-view sequence. This paper proposes an adaptive prediction and rate allocation scheme that efficiently compresses LF data irrespective of the acquisition technology used.
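    The conversion of a plenoptic image into sub-aperture images can be sketched by strided sampling, assuming an idealized rectangular microlens grid in which pixel (u, v) under every microlens belongs to view (u, v). Real lenslet data needs calibration, demosaicing, and hexagonal-grid resampling first; the function name is illustrative:

```python
import numpy as np

def lenslet_to_subapertures(lenslet: np.ndarray, U: int, V: int) -> np.ndarray:
    """Rearrange an (H*U, W*V) lenslet image into a (U, V, H, W) stack of
    sub-aperture views: the (u, v) pixel under each microlens is gathered
    into view (u, v) by strided slicing."""
    assert lenslet.shape[0] % U == 0 and lenslet.shape[1] % V == 0
    return np.stack([np.stack([lenslet[u::U, v::V] for v in range(V)])
                     for u in range(U)])
```

Each of the resulting U x V views can then be fed to the encoder as one frame of the pseudo multi-view sequence.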

  • 5.
    Ahmad, Waqas
    et al.
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informationssystem och -teknologi.
    Sjöström, Mårten
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informationssystem och -teknologi.
    Olsson, Roger
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informationssystem och -teknologi.
    Compression scheme for sparsely sampled light field data based on pseudo multi-view sequences (2018). In: Optics, Photonics, and Digital Technologies for Imaging Applications V, Proceedings of SPIE - The International Society for Optical Engineering, SPIE, 2018, Vol. 10679, article id 106790M. Conference paper (Refereed)
    Abstract [en]

    With the advent of light field acquisition technologies, the captured information of the scene is enriched by having both angular and spatial information. The captured information provides additional capabilities in the post-processing stage, e.g. refocusing, 3D scene reconstruction, and synthetic aperture. Light field capturing devices fall into two categories. In the first category, a single plenoptic camera is used to capture a densely sampled light field; in the second category, multiple traditional cameras are used to capture a sparsely sampled light field. In both cases, the size of the captured data increases with the additional angular information. The recent call for proposals related to compression of light field data by JPEG, also called "JPEG Pleno", reflects the need for a new and efficient light field compression solution. In this paper, we propose a compression solution for sparsely sampled light field data. In a multi-camera system, each view depicts the scene from a single perspective. We propose to interpret each single view as a frame of a pseudo video sequence. In this way, the complete MxN views of a multi-camera system are treated as M pseudo video sequences, where each pseudo video sequence contains N frames. The central pseudo video sequence is taken as the base view, and the first frame in all the pseudo video sequences is taken as the base picture order count (POC). The frame contained in the base view and base POC is labeled as the base frame. The remaining frames are divided into three predictor levels. Frames placed in each successive level can take prediction from previously encoded frames. However, the frames assigned the last prediction level are not used for prediction of other frames. Moreover, the rate allocation for each frame is performed by taking into account its predictor level, its frame distance, and its view-wise decoding distance relative to the base frame.
    The multi-view extension of high efficiency video coding (MV-HEVC) is used to compress the pseudo multi-view sequences. The MV-HEVC compression standard enables the frames to take prediction in both directions (horizontal and vertical), and MV-HEVC parameters are used to implement the proposed 2D prediction and rate allocation scheme. A subset of four light field images from the Stanford dataset is compressed using the proposed compression scheme at four bitrates in order to cover low- to high-bitrate scenarios. The comparison is made with the state-of-the-art reference encoder HEVC and its real-time implementation x265. The 17x17 grid is converted into a single pseudo sequence of 289 frames by following the order explained in the JPEG Pleno call for proposals and given as input to both reference schemes. The rate-distortion analysis shows that the proposed compression scheme outperforms both reference schemes in all tested bitrate scenarios for all test images. The average BD-PSNR gain is 1.36 dB over HEVC and 2.15 dB over x265.
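    The single-PVS anchors referred to above flatten the 17x17 view grid into one 289-frame sequence. A serpentine (boustrophedon) scan is one common such ordering; this sketch does not reproduce the exact order specified in the JPEG Pleno call for proposals:

```python
def serpentine_scan(grid):
    """Flatten an M x N grid of views into a single pseudo video sequence.

    Even rows are traversed left-to-right, odd rows right-to-left, so that
    consecutive frames are always adjacent views (good for inter prediction).
    """
    seq = []
    for i, row in enumerate(grid):
        seq.extend(row if i % 2 == 0 else list(reversed(row)))
    return seq
```

For a 17x17 grid this yields a 289-frame sequence, matching the anchor setup described above.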

  • 6.
    Ahmad, Waqas
    et al.
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informationssystem och -teknologi.
    Vagharshakyan, Suren
    Tampere University of Technology, Finland.
    Sjöström, Mårten
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informationssystem och -teknologi.
    Gotchev, Atanas
    Tampere University of Technology, Finland.
    Bregovic, Robert
    Tampere University of Technology, Finland.
    Olsson, Roger
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informationssystem och -teknologi.
    Shearlet Transform Based Prediction Scheme for Light Field Compression (2018). Conference paper (Refereed)
    Abstract [en]

    Light field acquisition technologies capture angular and spatial information of the scene. The spatial and angular information enables various post-processing applications, e.g. 3D scene reconstruction, refocusing, and synthetic aperture, at the expense of an increased data size. In this paper, we present a novel prediction tool for compression of light field data acquired with a multiple-camera system. The captured light field (LF) can be described using the two-plane parametrization L(u, v, s, t), where (u, v) represents the view image plane coordinates and (s, t) represents the coordinates of the capturing plane. In the proposed scheme, the captured LF is uniformly decimated by a factor d in both directions (in the s and t coordinates), resulting in a sparse set of views also referred to as key views. The key views are converted into a pseudo video sequence and compressed using high efficiency video coding (HEVC). The shearlet-transform-based reconstruction approach presented in [1] is used at the decoder side to predict the decimated views with the help of the key views. Four LF images (Truck and Bunny from the Stanford dataset, Set2 and Set9 from the High Density Camera Array dataset) are used in the experiments. The input LF views are converted into a pseudo video sequence and compressed with HEVC to serve as the anchor. Rate-distortion analysis shows an average PSNR gain of 0.98 dB over the anchor scheme. Moreover, at low bit rates the compression efficiency of the proposed scheme is higher than that of the anchor, while the anchor performs better at high bit rates. The different compression responses of the proposed and anchor schemes are a consequence of how they utilize the input information. In the high-bitrate scenario, high-quality residual information enables the anchor to achieve efficient compression. On the contrary, the shearlet transform relies on the key views to predict the decimated views without incorporating residual information; hence, it has an inherent reconstruction error. In the low-bitrate scenario, the bit budget of the proposed compression scheme allows the encoder to achieve high quality for the key views. The HEVC anchor scheme distributes the same bit budget among all the input LF views, which results in degradation of the overall visual quality. The sensitivity of the human visual system toward compression artifacts in low-bitrate cases favours the proposed compression scheme over the anchor scheme.
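    The uniform decimation into key views described above can be sketched as index selection over the (s, t) capturing-plane grid (an illustrative helper, not the authors' code):

```python
def key_view_indices(S: int, T: int, d: int):
    """Return the (s, t) indices of the key views obtained by uniformly
    decimating an S x T light field by a factor d in both directions.
    All remaining views are predicted from these at the decoder."""
    return [(s, t) for s in range(0, S, d) for t in range(0, T, d)]
```

For a 17x17 light field and d = 4, this keeps a 5x5 grid of key views, and the decoder must reconstruct the other 264 views from them.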

  • 7. Barkowsky, M.
    et al.
    Wang, Kun
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Institutionen för informationsteknologi och medier.
    Cousseau, R.
    Brunnstrom, K.
    Olsson, Roger
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Institutionen för informationsteknologi och medier.
    Le Callet, P.
    Subjective quality assessment of error concealment strategies for 3DTV in the presence of asymmetric transmission errors (2010). In: Proceedings of the 2010 IEEE 18th International Packet Video Workshop (PV), 2010, pp. 193-200. Conference paper (Refereed)
    Abstract [en]

    The transmission of 3DTV sequences over packet-based networks may result in degradation of the video quality due to packet loss. In the conventional 2D case, several different strategies are known for extrapolating the missing information and thus concealing the error. In 3D, however, the residual error after concealment of one view might lead to binocular rivalry with the correctly received second view. In this paper, three simple alternatives are presented: frame freezing, a reduced playback speed, and displaying only a single view for both eyes, thus effectively switching to 2D presentation. In a subjective experiment, the performance of the three methods in terms of quality of experience is evaluated for different packet loss scenarios. Error-free encoded videos at different bit rates have been included as anchor conditions. The subjective experiment method contains special precautions for measuring the Quality of Experience (QoE) for 3D content and also contains an indicator for visual discomfort. The results indicate that switching to 2D is currently the best choice, but difficulties with visual discomfort should be expected even for this method.

  • 8.
    Boström, Lena
    et al.
    Mittuniversitetet, Fakulteten för humanvetenskap, Avdelningen för utbildningsvetenskap.
    Sjöström, Mårten
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informationssystem och -teknologi.
    Karlsson, Håkan
    Mittuniversitetet, Fakulteten för humanvetenskap, Avdelningen för utbildningsvetenskap.
    Sundgren, Marcus
    Mittuniversitetet, Fakulteten för humanvetenskap, Avdelningen för utbildningsvetenskap.
    Andersson, Mattias
    Olsson, Roger
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informationssystem och -teknologi.
    Åhlander, Jimmy
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informationssystem och -teknologi.
    Digital visualisering i skolan: Mittuniversitetets slutrapport från förstudien (2018). Report (Other academic)
    Abstract [sv, translated]

    The aim of this study was twofold: to test alternative learning methods via a digital teaching aid in mathematics in a quasi-experimental study, and to apply user-experience methods to interactive visualizations, thereby increasing knowledge of how perceived quality depends on the technology used. The pilot study also focuses on several pressing areas of school development, both regionally and nationally, as well as important aspects of the link between technology, pedagogy, and evaluation methods within "the technical part". The former concerns declining mathematics results in schools, practice-oriented school research, strengthened digital competence, visualization and learning, and research on visualization and evaluation. The latter answers questions about which technical solutions have previously been used and for what purpose they were created, and how visualizations have been evaluated in textbooks and in the research literature.

    Regarding pupils' results, one of the study's main research questions, we found no significant differences between traditional teaching and teaching with the visualization aid (3D). Regarding pupils' attitudes toward the mathematics unit, attitudes in the grade 6 control group improved significantly, but not in grade 8. Concerning girls' and boys' results and attitudes, the girls in both classes had better prior knowledge than the boys, and in grade 6 the girls in the control group were more positive toward the mathematics unit than the boys. Beyond that, we discern no significant differences. Other important findings were that the test design was not optimal and that the time of day at which the test was taken mattered considerably. The qualitative analysis further points to positive attitudes and behaviors among the pupils when working with the visual teaching aid. Pupils' collaboration and communication improved during the lessons. The teachers also noted that the 3D teaching aid offered greater opportunities to stimulate several senses during the learning process. A clear conclusion is that the 3D teaching aid is an important complement in teaching, but cannot be used entirely on its own.

    We can join neither the researchers who consider 3D visualization superior as a teaching aid for pupils' results nor those who warn of its effects on pupils' cognitive overload. Our results are more in line with the conclusions drawn by the Swedish Institute for Educational Research (Skolforskningsinstitutet, 2017): teaching with digital aids in mathematics can have positive effects, but equally effective teaching could possibly be designed in other ways. However, our results point to several disruptive factors that may have affected the outcomes, and to the need for good technology and well-developed software.

    In the study we analyzed the results using two overarching frameworks for integrating technology support in learning, SAMR and TPACK. The former contributed a taxonomy for discussing how well the technology's possibilities were exploited by teaching aids and learning activities; the latter supported a discussion of the didactic questions with a focus on the role of technology. Both aspects are highly topical given the increasing digitalization of schools.

    From earlier research and this pilot study we understand that research methods must be designed carefully. Randomization of groups would be desirable. Performance measures can also be difficult to choose. Tests in which participants evaluate usability and user experience (UX), based on both qualitative and quantitative methods, are important for the use of the technology itself, but further evaluations are needed to link the technology and the visualization to the quality of learning and teaching. Several methods are thus needed, and collaboration across subjects and disciplines becomes important.

  • 9.
    Conti, Caroline
    et al.
    University of Lisbon, Portugal.
    Soares, Luis Ducla
    University of Lisbon, Portugal.
    Nunes, Paulo
    University of Lisbon, Portugal.
    Perra, Cristian
    University of Cagliari, Italy.
    Assunção, Pedro Amado
    Institute de Telecomunicacoes and Politecenico de Leiria, Portugal.
    Sjöström, Mårten
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informationssystem och -teknologi.
    Li, Yun
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informationssystem och -teknologi.
    Olsson, Roger
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informationssystem och -teknologi.
    Jennehag, Ulf
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informationssystem och -teknologi.
    Light Field Image Compression (2018). In: 3D Visual Content Creation, Coding and Delivery, ed. Assunção, Pedro Amado; Gotchev, Atanas. Cham: Springer, 2018, pp. 143-176. Book chapter (Refereed)
  • 10.
    Damghanian, Mitra
    et al.
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Olsson, Roger
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Sjöström, Mårten
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Depth and Angular Resolution in Plenoptic Cameras (2015). In: 2015 IEEE International Conference on Image Processing (ICIP), IEEE, 2015, pp. 3044-3048, article id 7351362. Conference paper (Refereed)
    Abstract [en]

    We present a model-based approach to extract the depth and angular resolution in a plenoptic camera. The obtained results for the depth and angular resolution are validated against Zemax ray-tracing results. The model-based approach gives the location and number of the resolvable depth planes in a plenoptic camera, as well as the angular resolution with regard to disparity in pixels. The approach is straightforward compared to practical measurements and, in contrast with the principal-ray-model approach, can reflect plenoptic camera parameters such as the microlens f-number. Easy and accurate quantification of different resolution terms forms the basis for designing the capturing setup and choosing a reasonable system configuration for plenoptic cameras. Results from this work will accelerate the customization of plenoptic cameras for particular applications without the need for expensive measurements.

  • 11.
    Damghanian, Mitra
    et al.
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Institutionen för informationsteknologi och medier.
    Olsson, Roger
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Institutionen för informationsteknologi och medier.
    Sjöström, Mårten
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Institutionen för informationsteknologi och medier.
    Extraction of the lateral resolution in a plenoptic camera using the SPC model (2012). In: 2012 International Conference on 3D Imaging, IC3D 2012 - Proceedings, IEEE conference proceedings, 2012, Art. no. 6615137. Conference paper (Refereed)
    Abstract [en]

    Established capturing properties like image resolution need to be described thoroughly in complex multidimensional capturing setups such as plenoptic cameras (PC), as these introduce a trade-off between resolution and features such as field of view, depth of field, and signal-to-noise ratio. Models, methods, and metrics that assist in exploring and formulating this trade-off are highly beneficial for the study as well as the design of complex capturing systems. This work presents how the important high-level property lateral resolution is extracted from our previously proposed Sampling Pattern Cube (SPC) model. The SPC carries ray information as well as focal properties of the capturing system it models. The proposed operator extracts the lateral resolution from the SPC model throughout an arbitrary number of depth planes, resulting in a depth-resolution profile. We have validated the resolution operator by comparing the achieved lateral resolution with previous results from simpler models and from wave-optics-based Monte Carlo simulations. The lateral resolution predicted by the SPC model agrees with the results from wave-optics-based numerical simulations and strengthens the conclusion that the SPC fills the gap between ray-based models and wave-optics-based models by including the focal information of the system as a model parameter. The SPC is proven a simple yet efficient model for extracting the depth-based lateral resolution as a high-level property of complex plenoptic capturing systems.

  • 12.
    Damghanian, Mitra
    et al.
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Olsson, Roger
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Sjöström, Mårten
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Performance analysis in Lytro camera: Empirical and model based approaches to assess refocusing quality (2014). In: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings, IEEE conference proceedings, 2014, pp. 559-563. Conference paper (Refereed)
    Abstract [en]

    In this paper we investigate the performance of the Lytro camera in terms of its refocusing quality. The refocusing quality of the camera is related to the spatial resolution and the depth of field as the contributing parameters. We quantify the spatial resolution profile as a function of depth using empirical and model-based approaches. The depth of field is then determined by thresholding the spatial resolution profile. In the model-based approach, the previously proposed sampling pattern cube (SPC) model for representation and evaluation of plenoptic capturing systems is utilized. For the experimental resolution measurements, camera evaluation results are extracted from images rendered by the Lytro full reconstruction rendering method. Results from both the empirical and model-based approaches assess the refocusing quality of the Lytro camera consistently, highlighting the usability of model-based approaches for performance analysis of complex capturing systems.
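    The abstract determines the depth of field by thresholding the spatial-resolution profile. Under the simplifying assumption of a single contiguous in-focus span, that step can be sketched as:

```python
def depth_of_field(depths, resolution, threshold):
    """Return the (near, far) depth span over which the spatial resolution
    stays at or above the threshold, or None if it never does.

    Assumes one contiguous above-threshold span, which is the typical shape
    of a refocusing-quality profile; real profiles may need smoothing first.
    """
    inside = [z for z, r in zip(depths, resolution) if r >= threshold]
    return (min(inside), max(inside)) if inside else None
```

The width of the returned span is the depth of field for the chosen resolution threshold.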

  • 13.
    Damghanian, Mitra
    et al.
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Institutionen för informationsteknologi och medier.
    Olsson, Roger
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Institutionen för informationsteknologi och medier.
    Sjöström, Mårten
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Institutionen för informationsteknologi och medier.
    The Sampling Pattern Cube: A Representation and Evaluation Tool for Optical Capturing Systems (2012). In: Advanced Concepts for Intelligent Vision Systems, ed. Blanc-Talon, Jacques; Philips, Wilfried; Popescu, Dan; Scheunders, Paul; Zemcík, Pavel. Berlin/Heidelberg: Springer, 2012, pp. 120-131. Conference paper (Refereed)
    Abstract [en]

    Knowledge about how the light field is sampled through a camera system gives the information required to investigate interesting camera parameters. We introduce a simple and handy model for looking into the sampling behavior of a camera system. We have applied this model to a single-lens system as well as to plenoptic cameras. We have investigated how camera parameters of interest are interpreted in our proposed model-based representation. This model also enables us to make comparisons between capturing systems, or to investigate how variations in an optical capturing system affect its sampling behavior.

  • 14.
    Damghanian, Mitra
    et al.
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Olsson, Roger
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Sjöström, Mårten
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Erdmann, Arne
    Raytrix GmbH.
    Perwass, Christian
    Raytrix GmbH.
    Spatial resolution in a multi-focus plenoptic camera (2014). In: IEEE International Conference on Image Processing, ICIP 2014, IEEE conference proceedings, 2014, pp. 1932-1936, article id 7025387. Conference paper (Refereed)
    Abstract [en]

    Evaluation of state-of-the-art plenoptic cameras is necessary for design and application purposes. In this work, spatial resolution is investigated in a multi-focus plenoptic camera using two approaches: empirical and model-based. The Raytrix R29 plenoptic camera is studied, which utilizes three types of microlenses with different focal lengths in a hexagonal array structure to increase the depth of field. The model-based approach utilizes the previously proposed sampling pattern cube (SPC) model for representation and evaluation of plenoptic capturing systems. For the experimental resolution measurements, spatial resolution values are extracted from images reconstructed by the provided Raytrix reconstruction method. Both the measurement and the SPC-model-based approaches demonstrate a gradual variation of the resolution values over a wide depth range for the multi-focus R29 camera. Moreover, the good agreement between the results from the model-based approach and those from the empirical approach confirms the suitability of the SPC model for evaluating high-level camera parameters, such as the spatial resolution, in a complex capturing system such as the R29 multi-focus plenoptic camera.

  • 15.
    Damghanian, Mitra
    et al.
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Olsson, Roger
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Sjöström, Mårten
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Navarro Fructuoso, Hector
    Department of Optics, University of Valencia, Spain.
    Martinez Corral, Manuel
    Department of Optics, University of Valencia, Spain.
    Investigating the lateral resolution in a plenoptic capturing system using the SPC model (2013). In: Proceedings of SPIE - The International Society for Optical Engineering: Digital Photography IX, SPIE - International Society for Optical Engineering, 2013, pp. 86600T. Conference paper (Refereed)
    Abstract [en]

    Complex multidimensional capturing setups such as plenoptic cameras (PC) introduce a trade-off between various system properties. Consequently, established capturing properties, like image resolution, need to be described thoroughly for these systems. Models and metrics that assist in exploring and formulating this trade-off are therefore highly beneficial for studying as well as designing complex capturing systems. This work demonstrates the capability of our previously proposed sampling pattern cube (SPC) model to extract the lateral resolution of plenoptic capturing systems. The SPC carries both ray information and focal properties of the capturing system it models. The proposed operator extracts the lateral resolution from the SPC model for an arbitrary number of depth planes, giving a depth-resolution profile. This operator utilizes the focal properties of the capturing system as well as the geometrical distribution of the light containers, which are the elements of the SPC model. We have validated the lateral resolution operator for different capturing setups by comparing the results with those from Monte Carlo numerical simulations based on the wave optics model. The lateral resolution predicted by the SPC model agrees with the results from the more complex wave optics model better than both the ray-based model and our previously proposed lateral resolution operator. This agreement strengthens the conclusion that the SPC fills the gap between ray-based models and the real system performance by including the focal information of the system as a model parameter. The SPC thus proves to be a simple yet efficient model for extracting the lateral resolution as a high-level property of complex plenoptic capturing systems.

  • 16.
    Dima, Elijs
    et al.
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informationssystem och -teknologi.
    Gao, Yuan
    Institute of Computer Science, Christian-Albrechts University of Kiel, Germany.
    Sjöström, Mårten
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informationssystem och -teknologi.
    Olsson, Roger
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informationssystem och -teknologi.
    Koch, Reinhard
    Institute of Computer Science, Christian-Albrechts University of Kiel, Germany.
    Esquivel, Sandro
    Institute of Computer Science, Christian-Albrechts University of Kiel, Germany.
    Estimation and Post-Capture Compensation of Synchronization Error in Unsynchronized Multi-Camera Systems. Manuscript (preprint) (Other academic)
  • 17.
    Dima, Elijs
    et al.
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Sjöström, Mårten
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Olsson, Roger
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Assessment of Multi-Camera Calibration Algorithms for Two-Dimensional Camera Arrays Relative to Ground Truth Position and Direction (2016). In: 3DTV-Conference, IEEE Computer Society, 2016, article id 7548887. Conference paper (Refereed)
    Abstract [en]

    Camera calibration methods are commonly evaluated on cumulative reprojection error metrics, on disparate one-dimensional datasets. To evaluate the calibration of cameras in two-dimensional arrays, assessments need to be made on two-dimensional datasets with constraints on camera parameters. In this study, the accuracy of several multi-camera calibration methods has been evaluated on the camera parameters that affect view projection the most. As input data, we used a 15-viewpoint two-dimensional dataset with intrinsic and extrinsic parameter constraints and extrinsic ground truth. The assessment showed that self-calibration methods using structure-from-motion reach the same intrinsic and extrinsic parameter estimation accuracy as the standard checkerboard calibration algorithm, and surpass a well-known self-calibration toolbox, BlueCCal. These results show that self-calibration is a viable approach to calibrating two-dimensional camera arrays, but improvements to state-of-the-art multi-camera feature matching are necessary to make BlueCCal as accurate as other self-calibration methods for two-dimensional camera arrays.

  • 18.
    Dima, Elijs
    et al.
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informationssystem och -teknologi.
    Sjöström, Mårten
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informationssystem och -teknologi.
    Olsson, Roger
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informationssystem och -teknologi.
    Modeling Depth Uncertainty of Desynchronized Multi-Camera Systems (2017). In: 2017 International Conference on 3D Immersion (IC3D), IEEE, 2017. Conference paper (Refereed)
    Abstract [en]

    Accurately recording motion from multiple perspectives is relevant for recording and processing immersive multimedia and virtual reality content. However, synchronization errors between multiple cameras limit the precision of scene depth reconstruction and rendering. In order to quantify this limit, a relation between camera desynchronization, camera parameters, and scene element motion has to be identified. In this paper, a parametric ray model describing depth uncertainty is derived and adapted to the pinhole camera model. A two-camera scenario is simulated to investigate the model behavior and how camera synchronization delay, scene element speed, and camera positions affect the system's depth uncertainty. Results reveal a linear relation between synchronization error, element speed, and depth uncertainty. View convergence is shown to affect mean depth uncertainty by up to a factor of 10. Results also show that depth uncertainty must be assessed on the full set of camera rays rather than on a central subset.
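The linear relation between synchronization error, element speed, and depth uncertainty can be illustrated with a toy parallel-stereo triangulation under a simplified pinhole model (the function, baseline, and focal length below are illustrative assumptions, not values from the paper):

```python
def depth_error(z, speed, sync_delay, baseline=0.1):
    """Depth error when one camera of a parallel stereo pair samples
    sync_delay seconds late while the scene point moves laterally.

    A point at depth z gives disparity d = f*b/z. If the point moves
    delta = speed * sync_delay along the baseline before the late camera
    fires, the measured disparity becomes f*(b - delta)/z, so the
    triangulated depth is z*b/(b - delta)."""
    delta = speed * sync_delay                  # lateral motion during the delay
    z_est = z * baseline / (baseline - delta)   # triangulated (erroneous) depth
    return abs(z_est - z)
```

For small delays the error grows approximately linearly with `speed * sync_delay`, consistent with the linear relation reported above.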

  • 19.
    Dima, Elijs
    et al.
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informationssystem och -teknologi.
    Sjöström, Mårten
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informationssystem och -teknologi.
    Olsson, Roger
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informationssystem och -teknologi.
    Kjellqvist, Martin
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informationssystem och -teknologi.
    Litwic, Lukasz
    Ericsson AB.
    Zhang, Zhi
    Ericsson AB.
    Rasmusson, Lennart
    Observit AB.
    Flodén, Lars
    Observit AB.
    LIFE: A Flexible Testbed For Light Field Evaluation (2018). Conference paper (Refereed)
    Abstract [en]

    Recording and imaging the 3D world has led to the use of light fields. Capturing, distributing and presenting light field data is challenging, and requires an evaluation platform. We define a framework for real-time processing, and present the design and implementation of a light field evaluation system. In order to serve as a testbed, the system is designed to be flexible, scalable, and able to model various end-to-end light field systems. This flexibility is achieved by encapsulating processes and devices in discrete framework systems. The modular capture system supports multiple camera types, general-purpose data processing, and streaming to network interfaces. The cloud system allows for parallel transcoding and distribution of streams. The presentation system encapsulates rendering and display specifics. The real-time ability was tested in a latency measurement; the capture and presentation systems process and stream frames within a 40 ms limit.

  • 20.
    Karlsson, Linda
    et al.
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Institutionen för informationsteknologi och medier.
    Olsson, Roger
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Institutionen för informationsteknologi och medier.
    Sjöström, Mårten
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Institutionen för informationsteknologi och medier.
    Temporal filter with bilinear interpolation for ROI video coding (2006). Report (Other academic)
    Abstract [en]

    In videoconferencing and video over the mobile phone, the main visual information is found within limited regions of the video. This enables improved perceived quality by region-of-interest (ROI) coding. In this paper we introduce a temporal preprocessing filter that reuses values of the previous frame, by which changes in the background are only allowed for every second frame. This reduces the bit rate by 10-25% or gives an increase in average PSNR of 0.29-0.98 dB. Further processing of the video sequence is necessary for an improved re-allocation of the resources. Motion of the ROI causes an absence of necessary background data at the ROI border. We conceal this by using a bilinear interpolation between the current and previous frame at the transition from background to ROI. This results in an improvement in average PSNR of 0.44-1.05 dB in the transition area, with a minor decrease in average PSNR within the ROI.
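A minimal sketch of the two ideas above — holding background pixels from the previous frame on every second frame, and blending between the previous and current frame over a transition band at the ROI border — might look as follows (the function names, band width, and linear blend weights are illustrative assumptions, not the paper's implementation):

```python
import numpy as np

def dilate(mask):
    """4-neighbour binary dilation (wrap-around borders; fine for a sketch)."""
    return (mask | np.roll(mask, 1, 0) | np.roll(mask, -1, 0)
                 | np.roll(mask, 1, 1) | np.roll(mask, -1, 1))

def roi_temporal_filter(prev, cur, roi_mask, frame_idx, band=2):
    """Keep ROI pixels from the current frame, hold background pixels
    from the previous frame on every second frame, and blend linearly
    over a band of `band` pixels around the ROI border."""
    if frame_idx % 2 == 0:
        return cur.astype(float)        # background may update this frame
    w = roi_mask.astype(float)          # blend weight: 1 in ROI, 0 outside
    m = roi_mask.copy()
    for i in range(band):               # linear ramp across the band
        grown = dilate(m)
        w[grown & ~m] = 1.0 - (i + 1) / (band + 1)
        m = grown
    return w * cur + (1.0 - w) * prev
```

The ramp stands in for the paper's bilinear interpolation: ROI pixels stay current, distant background is frozen, and border pixels get a weighted mix of the two frames.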

  • 21.
    Karlsson, Linda
    et al.
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Institutionen för informationsteknologi och medier.
    Sjöström, Mårten
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Institutionen för informationsteknologi och medier.
    Olsson, Roger
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Institutionen för informationsteknologi och medier.
    Spatio-Temporal Filter for ROI Video Coding (2006). In: Proceedings of the 14th European Signal Processing Conference (EUSIPCO 2006), Florence, Italy, 4-8 Sept. 2006. Conference paper (Other academic)
    Abstract [en]

    Reallocating resources within a video sequence to the regions of interest increases the perceived quality at limited bandwidths. In this paper we combine a spatial filter with a temporal filter, both of which are codec- and standard-independent. This spatio-temporal filter removes resources from both the motion vectors and the prediction error, with a computational complexity lower than that of the spatial filter by itself. This decreases the bit rate by 30-50% compared to coding the original sequence using H.264. The released bits can be used by the codec to increase the PSNR of the ROI by 1.58-4.61 dB, which is more than the spatial and temporal filters achieve by themselves.

  • 22.
    Li, Yongwei
    et al.
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informationssystem och -teknologi.
    Olsson, Roger
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informationssystem och -teknologi.
    Sjöström, Mårten
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informationssystem och -teknologi.
    An analysis of demosaicing for plenoptic capture based on ray optics (2018). In: Proceedings of 3DTV Conference 2018, 2018, article id 8478476. Conference paper (Refereed)
    Abstract [en]

    The plenoptic camera is gaining more and more attention as it captures the 4D light field of a scene with a single shot and enables a wide range of post-processing applications. However, the preprocessing steps for captured raw data, such as demosaicing, have been overlooked. Most existing decoding pipelines for plenoptic cameras still apply demosaicing schemes that were developed for conventional cameras. In this paper, we analyze the sampling pattern of microlens-based plenoptic cameras by ray-tracing techniques and ray phase space analysis. The goal of this work is to demonstrate guidelines and principles for demosaicing plenoptic captures by taking the unique microlens array design into account. We show that the sampling of the plenoptic camera behaves differently from that of a conventional camera and that the desired demosaicing scheme is depth-dependent.

  • 23.
    Li, Yun
    et al.
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Olsson, Roger
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Sjöström, Mårten
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Compression of Unfocused Plenoptic Images using a Displacement Intra prediction (2016). In: 2016 IEEE International Conference on Multimedia and Expo Workshop, ICMEW 2016, IEEE Signal Processing Society, 2016, article id 7574673. Conference paper (Refereed)
    Abstract [en]

    Plenoptic images are one type of light field content, produced by combining a conventional camera with an additional optical component in the form of a microlens array positioned in front of the image sensor surface. This camera setup captures a sub-sampling of the light field with high spatial fidelity over a small range, and with a more coarsely sampled angular range. The earliest applications that leverage plenoptic image content are image refocusing, non-linear distribution of out-of-focus areas, SNR vs. resolution trade-offs, and 3D-image creation, all provided by post-processing methods. In this work, we evaluate a compression method that we previously proposed for a different type of plenoptic image (focused, or plenoptic camera 2.0, content) than the unfocused (plenoptic camera 1.0) content used in this Grand Challenge. The method is an extension of the state-of-the-art video compression standard HEVC, in which we have brought the capability of bi-directional inter-frame prediction into the spatial prediction. The method is evaluated according to the scheme set out by the Grand Challenge, and the results show a high compression efficiency compared with JPEG, i.e., up to 6 dB improvement for the tested images.

  • 24.
    Li, Yun
    et al.
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Institutionen för informationsteknologi och medier.
    Sjöström, Mårten
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Institutionen för informationsteknologi och medier.
    Jennehag, Ulf
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Institutionen för informationsteknologi och medier.
    Olsson, Roger
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Institutionen för informationsteknologi och medier.
    A Scalable Coding Approach for High Quality Depth Image Compression (2012). In: 3DTV-Conference, IEEE conference proceedings, 2012, article id 6365469. Conference paper (Refereed)
    Abstract [en]

    The distortion introduced by traditional video encoders (e.g., H.264) at depth discontinuities can cause disturbing effects in the synthesized view. The proposed scheme aims at preserving the most significant depth transitions for a better view synthesis. Furthermore, it has a scalable structure. The scheme extracts edge contours from a depth image and represents them by chain code. The chain code and the sampled depth values on each side of the edge contour are encoded by differential and arithmetic coding. The depth image is reconstructed by diffusion of edge samples and uniform sub-samples from the low-quality depth image. At low bit rates, the proposed scheme outperforms HEVC intra at the edges in the synthesized views, which correspond to the significant discontinuities in the depth image. The overall quality is also better with the proposed scheme at low bit rates for content with distinct depth transitions. © 2012 IEEE.
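The edge-contour representation described above can be illustrated with a standard 8-directional Freeman chain code, where each step between successive contour pixels is stored as one of eight symbols (a generic sketch of chain coding, not the paper's exact codec; the differential and arithmetic coding stages are omitted):

```python
# 8-connected Freeman chain code directions, indexed 0..7
DIRS = [(0, 1), (-1, 1), (-1, 0), (-1, -1),
        (0, -1), (1, -1), (1, 0), (1, 1)]

def chain_encode(contour):
    """Encode an ordered list of (row, col) edge pixels as chain-code
    symbols; only the start point plus one 3-bit symbol per step is kept."""
    return [DIRS.index((r1 - r0, c1 - c0))
            for (r0, c0), (r1, c1) in zip(contour, contour[1:])]

def chain_decode(start, code):
    """Rebuild the full contour from its start point and the symbols."""
    contour = [start]
    for symbol in code:
        dr, dc = DIRS[symbol]
        r, c = contour[-1]
        contour.append((r + dr, c + dc))
    return contour
```

Storing 3 bits per contour step instead of full pixel coordinates is what makes the edge representation compact before the entropy-coding stage.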

  • 25.
    Li, Yun
    et al.
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Sjöström, Mårten
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Jennehag, Ulf
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Olsson, Roger
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Depth Image Post-processing Method by Diffusion (2013). In: Proceedings of SPIE - The International Society for Optical Engineering: 3D Image Processing (3DIP) and Applications, SPIE - International Society for Optical Engineering, 2013, article id 865003. Conference paper (Refereed)
    Abstract [en]

    Multi-view three-dimensional television relies on view synthesis to reduce the number of views being transmitted. Arbitrary views can be synthesized by utilizing corresponding depth images with textures. The depth images obtained from stereo pairs or range cameras may contain erroneous values, which entail artifacts in a rendered view. Post-processing of the data may then be utilized to enhance the depth image with the purpose of reaching a better quality of synthesized views. We propose a Partial Differential Equation (PDE)-based interpolation method for the reconstruction of smooth areas in depth images, while preserving significant edges. We modeled the depth image by adjusting thresholds for edge detection and a uniform sparse sampling factor, followed by second-order PDE interpolation. The objective results show that a depth image processed by the proposed method can achieve a better quality of synthesized views than the original depth image. Visual inspection confirmed the results.
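A second-order PDE interpolation of the kind described above can be sketched as harmonic diffusion: unknown pixels are iteratively replaced by the average of their four neighbours while edge samples and sparse samples stay fixed (a minimal illustration under assumed parameters; the paper's edge-detection and sampling stages are not reproduced):

```python
import numpy as np

def pde_interpolate(depth, known_mask, iters=2000):
    """Solve the discrete Laplace equation on the unknown pixels by
    Jacobi iteration; pixels where known_mask is True are kept fixed."""
    z = depth.astype(float).copy()
    for _ in range(iters):
        avg = (np.roll(z, 1, 0) + np.roll(z, -1, 0) +
               np.roll(z, 1, 1) + np.roll(z, -1, 1)) / 4.0
        z[~known_mask] = avg[~known_mask]   # update only the unknowns
    return z
```

Between two fixed depth values the solution converges to a smooth (here linear) ramp, which is the behaviour exploited when reconstructing smooth depth areas from edge and sparse samples.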

  • 26.
    Li, Yun
    et al.
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Sjöström, Mårten
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Jennehag, Ulf
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Olsson, Roger
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Depth Map Compression with Diffusion Modes in 3D-HEVC (2013). In: MMEDIA 2013 - 5th International Conferences on Advances in Multimedia / [ed] Philip Davies, David Newell, International Academy, Research and Industry Association (IARIA), 2013, pp. 125-129. Conference paper (Refereed)
    Abstract [en]

    For three-dimensional television, multiple views can be generated by using the Multi-view Video plus Depth (MVD) format. The depth maps of this format can be compressed efficiently by the 3D extension of High Efficiency Video Coding (3D-HEVC), which exploits the correlations between its two components: texture and the associated depth map. In this paper, we introduce two modes for depth map coding into HEVC, where the modes use diffusion. The framework for inter-component prediction of Depth Modeling Modes (DMM) is utilized for the proposed modes. They detect edges from textures and then diffuse an entire block from known adjacent blocks by using the Laplace equation constrained by the detected edges. The experimental results show that depth maps can be compressed more efficiently with the proposed diffusion modes, where the bit rate saving can reach 1.25 percent of the total depth bit rate at a constant quality of synthesized views.

  • 27.
    Li, Yun
    et al.
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Sjöström, Mårten
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Jennehag, Ulf
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Olsson, Roger
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Tourancheau, Sylvain
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Subjective Evaluation of an Edge-based Depth Image Compression Scheme (2013). In: Proceedings of SPIE - The International Society for Optical Engineering: Stereoscopic Displays and Applications XXIV, SPIE - International Society for Optical Engineering, 2013, article id 86480D. Conference paper (Refereed)
    Abstract [en]

    Multi-view three-dimensional television requires many views, which may be synthesized from two-dimensional images with accompanying pixel-wise depth information. This depth image, which typically consists of smooth areas and sharp transitions at object borders, must be consistent with the acquired scene in order for synthesized views to be of good quality. We have previously proposed a depth image coding scheme that preserves significant edges and encodes the smooth areas between these. An objective evaluation considering the structural similarity (SSIM) index for synthesized views demonstrated an advantage of the proposed scheme over the High Efficiency Video Coding (HEVC) intra mode in certain cases. However, there were some discrepancies between the outcomes of the objective evaluation and of our visual inspection, which motivated this study of subjective tests. The test was conducted according to the ITU-R BT.500-13 recommendation with stimulus-comparison methods. The results from the subjective test showed that the proposed scheme performs slightly better than HEVC, with statistical significance at the majority of the tested bit rates for the given contents.

  • 28.
    Li, Yun
    et al.
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Sjöström, Mårten
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Olsson, Roger
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Coding of plenoptic images by using a sparse set and disparities (2015). In: Proceedings - IEEE International Conference on Multimedia and Expo, IEEE conference proceedings, 2015, article id 7177510. Conference paper (Refereed)
    Abstract [en]

    A focused plenoptic camera captures not only the spatial information of a scene but also the angular information. The capture results in a plenoptic image of large resolution consisting of multiple microlens images, each similar to its neighbors. Therefore, an efficient compression method that utilizes this pattern of similarity can reduce the coding bit rate and further facilitate the usage of the images. In this paper, we propose an approach for coding focused plenoptic images by using a representation that consists of a sparse plenoptic image set and disparities. Based on this representation, a reconstruction method using interpolation and inpainting is devised to reconstruct the original plenoptic image. As a consequence, instead of coding the original image directly, we encode the sparse image set plus the disparity maps and use the reconstructed image as a prediction reference to encode the original image. The results show that the proposed scheme performs better than HEVC intra, with more than 5 dB PSNR improvement or over 60 percent bit rate reduction.

  • 29.
    Li, Yun
    et al.
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Sjöström, Mårten
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Olsson, Roger
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Jennehag, Ulf
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Coding of focused plenoptic contents by displacement intra prediction (2016). In: IEEE Transactions on Circuits and Systems for Video Technology, ISSN 1051-8215, E-ISSN 1558-2205, Vol. 26, no. 7, pp. 1308-1319, article id 7137669. Journal article (Refereed)
    Abstract [en]

    A light field is commonly described by a two-plane representation with four dimensions. Refocused three-dimensional content can be rendered from light field images. One method for capturing these images is by using cameras with microlens arrays. A dense sampling of the light field results in large amounts of redundant data, so efficient compression is vital for practical use of these data. In this paper, we propose a displacement intra prediction scheme with a maximum of two hypotheses for the compression of plenoptic content from focused plenoptic cameras. The proposed scheme is further implemented into HEVC. The work aims at coding plenoptic captured content efficiently without knowledge of the underlying camera geometries. In addition, a theoretical analysis of displacement intra prediction for plenoptic images is given, and the relationship between the compressed captured images and their rendered quality is analyzed. Evaluation results show that plenoptic content can be efficiently compressed by the proposed scheme. Bit rate reductions of up to 60 percent over HEVC are obtained for plenoptic images, and more than 30 percent is achieved for the tested video sequences.
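The core idea of displacement intra prediction — predicting a block from one or two already-reconstructed, displaced blocks within the same image, exploiting the repetitive microlens structure — can be sketched as follows (the function and displacement offsets are illustrative assumptions; the actual scheme is integrated into HEVC's mode decision and signalling, which this sketch omits):

```python
import numpy as np

def displaced_prediction(recon, top, left, h, w, d1, d2=None):
    """Predict the h-by-w block at (top, left) from one displaced block,
    or from the average of two displaced blocks (two hypotheses), taken
    from the already-reconstructed area of the same image."""
    def grab(d):
        r, c = top + d[0], left + d[1]
        return recon[r:r + h, c:c + w].astype(float)
    pred = grab(d1)
    if d2 is not None:
        pred = (pred + grab(d2)) / 2.0   # two-hypothesis (bi-)prediction
    return pred
```

For plenoptic content, a displacement close to the microlens pitch tends to land on a near-identical neighbouring microlens image, which is what makes this form of prediction effective.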

  • 30.
    Li, Yun
    et al.
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Sjöström, Mårten
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Olsson, Roger
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Jennehag, Ulf
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Efficient Intra Prediction Scheme For Light Field Image Compression (2014). In: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings, IEEE conference proceedings, 2014, article id 6853654. Conference paper (Refereed)
    Abstract [en]

    Interactive photo-realistic graphics can be rendered from light field datasets. One way of capturing such a dataset is by using a light field camera with a microlens array. The captured images contain repetitive patterns resulting from adjacent microlenses, and do not resemble a natural scene. This dissimilarity leads to problems when compressing light field images with traditional image and video encoders, which are optimized for natural images and video sequences. In this paper, we introduce the full inter-prediction scheme of HEVC into intra-prediction for the compression of light field images. The proposed scheme is capable of performing both unidirectional and bi-directional prediction within an image. The evaluation results show that quality improvements above 3 dB, or bit-rate savings above 50 percent, can be achieved in terms of BD-PSNR for the proposed scheme compared to the original HEVC intra-prediction for light field images.

  • 31.
    Li, Yun
    et al.
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Sjöström, Mårten
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Olsson, Roger
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Jennehag, Ulf
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Scalable coding of plenoptic images by using a sparse set and disparities (2016). In: IEEE Transactions on Image Processing, ISSN 1057-7149, E-ISSN 1941-0042, Vol. 25, no. 1, pp. 80-91, article id 7321029. Journal article (Refereed)
    Abstract [en]

    One of the light field capturing techniques is focused plenoptic capturing. By placing a microlens array in front of the photosensor, focused plenoptic cameras capture both spatial and angular information of a scene, within each microlens image and across microlens images. The capture results in a significant amount of redundant information, and the captured image is usually of large resolution. A coding scheme that removes the redundancy before coding can therefore be advantageous for efficient compression, transmission and rendering. In this paper, we propose a lossy coding scheme to efficiently represent plenoptic images. The format contains a sparse image set and its associated disparities. The reconstruction is performed by disparity-based interpolation and inpainting, and the reconstructed image is later employed as a prediction reference for the coding of the full plenoptic image. As an outcome of the representation, the proposed scheme inherits a scalable structure with three layers. The results show that plenoptic images are compressed efficiently, with over 60 percent bit rate reduction compared to HEVC intra, and with over 20 percent compared to the HEVC block copying mode.

  • 32.
    Muddala, Suryanarayana M.
    et al.
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Olsson, Roger
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Sjöström, Mårten
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Disocclusion Handling Using Depth-Based Inpainting (2013). In: Proceedings of MMEDIA 2013, The Fifth International Conferences on Advances in Multimedia, Venice, Italy, International Academy, Research and Industry Association (IARIA), 2013, pp. 136-141. Conference paper (Refereed)
    Abstract [en]

    Depth image based rendering (DIBR) plays an important role in producing virtual views using 3D-video formats such as video plus depth (V+D) and multi-view video plus depth (MVD). Pixel regions with undefined values (due to disoccluded areas) are exposed when DIBR is used. In this paper, we propose a depth-based inpainting method aimed at handling disocclusions in DIBR from V+D and MVD. Our proposed method adopts the curvature driven diffusion (CDD) model as a data term, to which we add a depth constraint. In addition, we add depth to further guide a directional priority term in the exemplar-based texture synthesis. Finally, we add depth in the patch-matching step to prioritize background texture when inpainting. The proposed method is evaluated by comparing inpainted virtual views with corresponding views produced by three state-of-the-art inpainting methods as references. The evaluation shows the proposed method yielding an increased objective quality compared to the reference methods, and visual inspection further indicates an improved visual quality.

  • 33.
    Muddala, Suryanarayana M.
    et al.
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Sjöström, Mårten
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Olsson, Roger
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Virtual View Synthesis Using Layered Depth Image Generation and Depth-Based Inpainting for Filling Disocclusions and Translucent Disocclusions (2016). In: Journal of Visual Communication and Image Representation, ISSN 1047-3203, E-ISSN 1095-9076, Vol. 38, pp. 351-366. Journal article (Refereed)
    Abstract [en]

    View synthesis is an efficient solution to produce content for 3DTV and FTV. However, proper handling of disocclusions is a major challenge in view synthesis. Inpainting methods offer solutions for handling disocclusions, though limitations in foreground-background classification cause the holes to be filled with inconsistent textures. Moreover, state-of-the-art methods fail to identify and fill disocclusions at intermediate distances between foreground and background, through which the background may be visible in the virtual view (translucent disocclusions). Aiming at improved rendering quality, we introduce a layered depth image (LDI) in the original camera view, in which we identify and fill occluded background so that, when the LDI data is rendered to a virtual view, no disocclusions appear; instead, views with consistent data are produced, and translucent disocclusions are handled as well. Moreover, the proposed foreground-background classification and inpainting fill the disocclusions consistently with neighboring background texture. Based on the objective and subjective evaluations, the proposed method outperforms the state-of-the-art methods at the disocclusions.

  • 34.
    Muddala, Suryanarayana Murthy
    et al.
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Olsson, Roger
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Sjöström, Mårten
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Depth-Included Curvature Inpainting for Disocclusion Filling in View Synthesis (2013). In: International Journal On Advances in Telecommunications, ISSN 1942-2601, E-ISSN 1942-2601, Vol. 6, no. 3&4, pp. 132-142. Journal article (Refereed)
    Abstract [en]

    Depth-image-based rendering (DIBR) is commonly used for generating additional views for 3DTV and FTV using 3D video formats such as video plus depth (V+D) and multi-view-video-plus-depth (MVD). The synthesized views suffer from artifacts, mainly at disocclusions, when DIBR is used. Depth-based inpainting methods can solve these problems plausibly. In this paper, we analyze the influence of the depth information at various steps of the depth-included curvature inpainting method. The depth-based inpainting method relies on the depth information at every step of the inpainting process: boundary extraction for missing areas, data term computation for structure propagation, and patch matching to find the best-matching data. The importance of depth at each step is evaluated using objective metrics and visual comparison. Our evaluation demonstrates that the depth information plays a key role in each step. Moreover, to what degree depth can be used in each step of the inpainting process depends on the depth distribution.

  • 35.
    Muddala, Suryanarayana Murthy
    et al.
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Sjöström, Mårten
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Olsson, Roger
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Depth-Based Inpainting For Disocclusion Filling (2014). In: 3DTV-Conference, IEEE Computer Society, 2014, Art. no. 6874752. Conference paper (Refereed)
    Abstract [en]

    Depth-based inpainting methods can solve disocclusion problems occurring in depth-image-based rendering. However, inpainting in this context suffers from artifacts along foreground objects due to foreground pixels entering the patch matching. In this paper, we address the disocclusion problem with a refined depth-based inpainting method. The novelty lies in classifying the foreground and background using available local depth information. Thereby, foreground information is excluded from both the source region and the target patch. In the proposed inpainting method, the local depth constraints imply inpainting only the background data and preserving the foreground object boundaries. The results from the proposed method are compared with those from state-of-the-art inpainting methods. The experimental results demonstrate improved objective quality and better visual quality along the object boundaries.
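    The core idea of classifying hole-border pixels into foreground and background from local depth can be sketched as follows. Thresholding at the midpoint of the local depth range, and the convention that a larger depth value means farther away (background), are my own simplifying assumptions for illustration.

    ```python
    import numpy as np

    def classify_border(depths):
        """Split local depth samples around a disocclusion into foreground
        and background by thresholding at the midpoint of the local range.
        Returns the background mask and the mean background depth, which
        a depth-constrained inpainter could use to restrict its sources."""
        thr = 0.5 * (depths.min() + depths.max())
        is_bg = depths > thr          # True = background (farther away)
        return is_bg, depths[is_bg].mean()
    ```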

  • 36.
    Muddala, Suryanarayana Murthy
    et al.
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Institutionen för informationsteknologi och medier.
    Sjöström, Mårten
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Institutionen för informationsteknologi och medier.
    Olsson, Roger
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Institutionen för informationsteknologi och medier.
    Edge-preserving depth-image-based rendering method (2012). In: 2012 International Conference on 3D Imaging, IC3D 2012 - Proceedings, 2012, Art. no. 6615113. Conference paper (Refereed)
    Abstract [en]

    Distribution of future 3DTV is likely to use supplementary depth information to a video sequence. New virtual views may then be rendered in order to adjust to different 3D displays. All depth-image-based rendering (DIBR) methods suffer from artifacts in the resulting images, which are corrected by different post-processing. The proposed method is based on fundamental principles of 3D warping. The novelty lies in how the virtual view sample values are obtained from one-dimensional interpolation, where edges are preserved by introducing specific edge-pixels with information about both foreground and background data. This fully avoids the post-processing of filling cracks and holes. We compared rendered virtual views of our method and of the View Synthesis Reference Software (VSRS) and analyzed the results based on typical artifacts. The proposed method obtained better quality for photographic images and similar quality for synthetic images.

  • 37.
    Muddala, Suryanarayana Murthy
    et al.
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Sjöström, Mårten
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Olsson, Roger
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Tourancheau, Sylvain
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Edge-aided virtual view rendering for multiview video plus depth (2013). In: Proceedings of SPIE Volume 8650, Burlingame, CA, USA, 2013: 3D Image Processing (3DIP) and Applications 2013, SPIE - International Society for Optical Engineering, 2013, Art. no. 86500E. Conference paper (Other academic)
    Abstract [en]

    Depth-Image-Based Rendering (DIBR) of virtual views is a fundamental method in three-dimensional (3D) video applications to produce different perspectives from texture and depth information, in particular the multi-view-plus-depth (MVD) format. Artifacts are still present in virtual views as a consequence of imperfect rendering using existing DIBR methods. In this paper, we propose an alternative DIBR method for MVD. In the proposed method we introduce an edge pixel and interpolate pixel values in the virtual view using the actual projected coordinates from two adjacent views, by which cracks and disocclusions are automatically filled. In particular, we propose a method to merge pixel information from two adjacent views in the virtual view before the interpolation; we apply a weighted averaging of projected pixels within the range of one pixel in the virtual view. We compared virtual view images rendered by the proposed method to the corresponding view images rendered by state-of-the-art methods. Objective metrics demonstrated an advantage of the proposed method for most investigated media contents. Subjective test results showed preference for different methods depending on media content, and the test could not demonstrate a significant difference between the proposed method and state-of-the-art methods.
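    The weighted averaging of projected pixels can be sketched for a single scanline: each sample projected from an adjacent view lands at a fractional position and contributes to the two nearest integer-grid virtual-view pixels with weight 1 minus its distance. The exact weighting in the paper may differ; this is an illustrative sketch with names of my own choosing.

    ```python
    import numpy as np

    def merge_samples(positions, values, width):
        """Merge pixels projected from adjacent views into one scanline of
        the virtual view. Each sample contributes to the virtual pixels
        within one pixel of its fractional landing position, weighted by
        (1 - distance); contributions are normalised by the total weight."""
        acc = np.zeros(width)
        wsum = np.zeros(width)
        for pos, val in zip(positions, values):
            base = int(np.floor(pos))
            for x in (base, base + 1):
                if 0 <= x < width:
                    w = 1.0 - abs(x - pos)
                    if w > 0:
                        acc[x] += w * val
                        wsum[x] += w
        out = np.zeros(width)
        nz = wsum > 0
        out[nz] = acc[nz] / wsum[nz]
        return out
    ```

    A sample at position 1.5 splits equally between pixels 1 and 2, and two samples straddling pixel 2 blend into a distance-weighted average there.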

  • 38.
    Muddala, Suryanarayana
    et al.
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Olsson, Roger
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Sjöström, Mårten
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Spatio-Temporal Consistent Depth-Image Based Rendering Using Layered Depth Image and Inpainting (2016). In: EURASIP Journal on Image and Video Processing, ISSN 1687-5176, E-ISSN 1687-5281, Vol. 9, no. 1, pp. 1-19. Journal article (Refereed)
    Abstract [en]

    Depth-image-based rendering (DIBR) is a commonly used method for synthesizing additional views using the video-plus-depth (V+D) format. A critical issue with DIBR-based view synthesis is the lack of information behind foreground objects. This lack manifests as disocclusions, holes next to the foreground objects in rendered virtual views, as a consequence of the virtual camera "seeing" behind the foreground object. The disocclusions are larger in the extrapolation case, i.e. the single-camera case. Texture synthesis methods (inpainting methods) aim to fill these disocclusions by producing plausible texture content. However, virtual views inevitably exhibit both spatial and temporal inconsistencies at the filled disocclusion areas, depending on the scene content. In this paper we propose a layered depth image (LDI) approach that improves the spatio-temporal consistency. In the process of LDI generation, depth information is used to classify the foreground and background in order to form a static scene sprite from a set of neighboring frames. Occlusions in the LDI are then identified and filled using inpainting, such that no disocclusions appear when the LDI data is rendered to a virtual view. In addition to the depth information, optical flow is computed to extract the stationary parts of the scene and to classify the occlusions in the inpainting process. Experimental results demonstrate that spatio-temporal inconsistencies are significantly reduced using the proposed method. Furthermore, subjective and objective qualities are improved compared to state-of-the-art reference methods.

  • 39.
    Navarro, Hector
    et al.
    Department of Optics, University of Valencia, E-46100 Burjassot, Spain.
    Saavedra, Genaro
    Department of Optics, University of Valencia, E-46100 Burjassot, Spain.
    Martinez-Corral, Manuel
    Department of Optics, University of Valencia, E-46100 Burjassot, Spain.
    Sjöström, Mårten
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för data- och systemvetenskap.
    Olsson, Roger
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för data- och systemvetenskap.
    Depth-of-field enhancement in integral imaging by selective depth-deconvolution (2014). In: IEEE/OSA Journal of Display Technology, ISSN 1551-319X, E-ISSN 1558-9323, Vol. 10, no. 3, pp. 182-188. Journal article (Refereed)
    Abstract [en]

    One of the major drawbacks of the integral imaging technique is its limited depth of field. This limitation is imposed by the numerical aperture of the microlenses. In this paper we propose a method to extend the depth of field of integral imaging systems in the reconstruction stage. The method is based on the combination of deconvolution tools and depth filtering of each elemental image using disparity map information. We demonstrate our proposal by presenting digital reconstructions of a 3D scene focused at different depths with extended depth of field.
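    Deconvolution-based depth-of-field extension builds on standard inverse filtering. A minimal frequency-domain Wiener deconvolution (1D signal, circular boundary) is sketched below under my own simplifications; the paper's method additionally uses the disparity map to select a per-depth PSF for each elemental image region.

    ```python
    import numpy as np

    def wiener_deconvolve(blurred, psf, k=0.01):
        """Frequency-domain Wiener deconvolution of a 1D signal.
        k is the noise-to-signal power ratio; as k -> 0 this approaches
        the (noise-amplifying) inverse filter."""
        n = len(blurred)
        H = np.fft.fft(psf, n)                      # PSF spectrum
        G = np.fft.fft(blurred)                     # blurred-signal spectrum
        W = np.conj(H) / (np.abs(H) ** 2 + k)       # Wiener filter
        return np.real(np.fft.ifft(W * G))
    ```

    Blurring an impulse with a short PSF and deconvolving with a small k recovers the impulse at its original position.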

  • 40.
    Navarro-Fructuoso, Hector
    et al.
    Dept. of Optics, Univ. of Valencia, E-46100, Burjassot, Spain .
    Saavedra-Tortosa, G.
    Dept. of Optics, Univ. of Valencia, E-46100, Burjassot, Spain .
    Martinez-Corral, Manuel
    Dept. of Optics, Univ. of Valencia, E-46100, Burjassot, Spain .
    Sjöström, Mårten
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Olsson, Roger
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Extended depth-of-field in integral imaging by depth-dependent deconvolution (2013). Conference paper (Refereed)
    Abstract [en]

    Integral Imaging is a technique to obtain true color 3D images that can provide full and continuous motion parallax for several viewers. The depth of field of these systems is mainly limited by the numerical aperture of each lenslet of the microlens array. A digital method has been developed to increase the depth of field of Integral Imaging systems in the reconstruction stage. By means of the disparity map of each elemental image, it is possible to classify the objects of the scene according to their distance from the microlenses and apply a selective deconvolution for each depth of the scene. Topographical reconstructions with enhanced depth of field of a 3D scene are presented to support our proposal.

  • 41.
    Olsson, Roger
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Institutionen för informationsteknologi och medier.
    Empirical rate-distortion analysis of JPEG 2000 3D and H.264/AVC coded integral imaging based 3D-images (2008). In: 2008 3DTV Conference - True Vision - Capture, Transmission and Display of 3D Video, IEEE conference proceedings, 2008, pp. 93-96. Conference paper (Refereed)
    Abstract [en]

    Novel camera systems producing 3D-images that contain light direction in addition to light intensity are emerging. Integral imaging (II) is a technique on which many of these systems rely. The pictures produced by these cameras (II-pictures) require substantial data storage compared to their 2D counterparts. This paper investigates how coding the II-pictures using H.264/AVC and JPEG 2000 Part 10 (JP3D) affects the images in terms of rate-distortion as well as introduced coding artifacts. A set of four reference images is coded using a number of pre-processing and encoding variants, so-called coding schemes. For low bitrates (<0.5 bpp) the H.264/AVC-based coding schemes have higher coding efficiency, which asymptotically levels off at higher bitrates in favor of JP3D. The JP3D-coded 3D-images show less spread in quality than H.264/AVC, when quality is evaluated using PSNR as a function of viewing angle. However, the distortion induced by H.264/AVC is primarily localized to object borders within the 3D-image, and in initial tests it appears less visible than the JP3D coding artifacts, which spread out evenly over the image. Extensive subjective tests will be performed in future work to further support the presented results.
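    Evaluating PSNR as a function of viewing angle amounts to computing PSNR per extracted sub-image (view) of the integral image. A sketch follows, assuming square lens blocks of size `lens` with view (u, v) formed by sampling pixel (u, v) under every lens; the layout and names are my assumptions, not the paper's exact procedure.

    ```python
    import numpy as np

    def psnr(ref, test, peak=255.0):
        """Standard peak-signal-to-noise ratio in dB."""
        mse = np.mean((ref.astype(float) - test.astype(float)) ** 2)
        return float('inf') if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)

    def per_view_psnr(ref_ii, test_ii, lens):
        """PSNR per view of an integral image: view (u, v) is extracted by
        strided slicing, taking pixel (u, v) under each lens block."""
        return {(u, v): psnr(ref_ii[u::lens, v::lens],
                             test_ii[u::lens, v::lens])
                for u in range(lens) for v in range(lens)}
    ```

    Distorting only the pixels belonging to one view lowers that view's PSNR while the other views stay lossless, which is exactly the per-angle spread the abstract discusses.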

  • 42.
    Olsson, Roger
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Institutionen för informationsteknologi och medier.
    Synthesis, Coding, and Evaluation of 3D Images Based on Integral Imaging (2008). Doctoral thesis, monograph (Other academic)
    Abstract [en]

    In recent years camera prototypes based on Integral Imaging (II) have emerged that are capable of capturing three-dimensional (3D) images. When being viewed on a 3D display, these II-pictures convey depth and content that realistically change perspective as the viewer changes the viewing position.

    The dissertation concentrates on three factors restraining II-picture progress. Firstly, there is a lack of digital II-pictures available for, inter alia, comparative research and coding scheme development. Secondly, there is an absence of objective quality metrics that explicitly measure distortion with respect to the II-picture properties: depth and view-angle dependency. Thirdly, low coding efficiencies are achieved when present image coding standards are applied to II-pictures.

    A computer synthesis method has been developed, which enables the production of different II-picture types. An II-camera model forms a basis and is combined with a scene description language that allows for the describing of arbitrary complex virtual scenes. The light transport within the scene and into the II-camera is simulated using ray-tracing and geometrical optics. A number of II-camera models, scene descriptions, and II-pictures are produced using the presented method.

    Two quality evaluation metrics have been constructed to objectively quantify the distortion contained in an II-picture with respect to its specific properties. The first metric models how the distortion is perceived by a viewer watching an II-display from different viewing-angles. The second metric estimates the depth-distribution of the distortion. New aspects of coding-induced artifacts within the II-picture are revealed using the proposed metrics.

    Finally, a coding scheme for II-pictures has been developed that inter alia utilizes the video coding standard H.264/AVC by firstly transforming the II-picture into a pseudo video sequence. The properties of the coding scheme have been studied in detail and compared with other coding schemes using the proposed evaluation metrics. The proposed coding scheme achieves the same quality as JPEG2000 at approximately 1/60th of the storage- or distribution requirements.

  • 43.
    Olsson, Roger
    et al.
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Institutionen för informationsteknologi och medier.
    Adhikarla, Vamsi Kiran
    Blekinge Institute of Technology, Karlskrona, Sweden.
    Schwarz, Sebastian
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Institutionen för informationsteknologi och medier.
    Sjöström, Mårten
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Institutionen för informationsteknologi och medier.
    Converting conventional stereo pairs to multi-view sequences using morphing (2012). In: Proceedings of SPIE - The International Society for Optical Engineering, SPIE - International Society for Optical Engineering, 2012, Art. no. 828828. Conference paper (Refereed)
    Abstract [en]

    Autostereoscopic multi-view displays require multiple views of a scene to provide motion parallax. When an observer changes viewing angle, different stereoscopic pairs are perceived. This allows new perspectives of the scene to be seen, giving a more realistic 3D experience. However, capturing an arbitrary number of views is at best cumbersome, and on some occasions impossible. Conventional stereo video (CSV) operates on two video signals captured using two cameras at two different perspectives. Generation and transmission of two views is more feasible than that of multiple views. It would be more efficient if the multiple views required by an autostereoscopic display could be synthesized from this sparse set of views. This paper addresses the conversion of stereoscopic video to multiview video using the video effect morphing. Different morphing algorithms are implemented and evaluated. Contrary to traditional conversion methods, these algorithms disregard the physical depth explicitly and instead generate intermediate views using sparse sets of correspondence features and image morphing. A novel morphing algorithm is also presented that uses the scale-invariant feature transform (SIFT) and segmentation to construct robust correspondence features and high-quality intermediate views. All algorithms are evaluated on a subjective and objective basis and the comparison results are presented.

  • 44.
    Olsson, Roger
    et al.
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Institutionen för informationsteknologi och medier.
    Andersson, Håkan
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Institutionen för informationsteknologi och medier.
    Sjöström, Mårten
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Institutionen för informationsteknologi och medier.
    A modular cross-platform GPU-based approach for flexible 3D video playback (2011). In: Proceedings of SPIE - The International Society for Optical Engineering, SPIE - International Society for Optical Engineering, 2011, Art. no. 78631E. Conference paper (Refereed)
    Abstract [en]

    Different compression formats for stereo- and multiview-based 3D video are being standardized, and software players capable of decoding and presenting these formats on different display types are a vital part in the commercialization and evolution of 3D video. However, the number of publicly available software video players capable of decoding and playing multiview 3D video is still quite limited. This paper describes the design and implementation of a GPU-based real-time 3D video playback solution, built on top of cross-platform, open-source libraries for video decoding and hardware-accelerated graphics. A software architecture is presented that efficiently processes and presents high-definition 3D video in real time and in a flexible manner supports both current 3D video formats and emerging standards. Moreover, a set of bottlenecks in the processing of 3D video content in a GPU-based real-time 3D video playback solution is identified and discussed.

  • 45.
    Olsson, Roger
    et al.
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Institutionen för informationsteknologi och medier.
    Sjöström, Mårten
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Institutionen för informationsteknologi och medier.
    A depth dependent quality metric for evaluation of coded integral imaging based 3d-images (2007). In: Proceedings of 3DTV Conference, New York: IEEE conference proceedings, 2007, pp. 403-406. Conference paper (Refereed)
    Abstract [en]

    The 2D quality metric Peak-Signal-to-Noise-Ratio (PSNR) is often used when evaluating the quality of coding schemes for integral imaging (II) based 3D-images: either by applying the PSNR to the full II, resulting in an accumulated quality metric for all possible views, or by applying it to extracted sub-images, which results in a viewing-angle-dependent metric. However, both of these approaches fail to capture a coding scheme's distribution of artifacts at different depths within the 3D-image. In this paper we propose a quality metric that evaluates the quality of the 3D-image at different depths, which results in a 1D quality line. First, we introduce the metric and the operations that are used for its evaluation. Second, the experimental setup used to evaluate the metric is presented. Finally, the metric is evaluated on a set of IIs, coded using four different coding schemes. The preliminary results indicate a strong correlation with the coding artifacts that are visible at different depths.

  • 46.
    Olsson, Roger
    et al.
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Institutionen för informationsteknologi och medier.
    Sjöström, Mårten
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Institutionen för informationsteknologi och medier.
    A novel quality metric for evaluation of depth dependent coding artifacts in 3D images (2008). In: Stereoscopic Displays and Applications XIX, SPIE - International Society for Optical Engineering, 2008, pp. 80307-. Conference paper (Refereed)
    Abstract [en]

    The two-dimensional quality metric Peak-Signal-to-Noise-Ratio (PSNR) is often used to evaluate the quality of coding schemes for different types of light-field-based 3D-images, e.g. integral imaging or multi-view. The metric results in a single accumulated quality value for the whole 3D-image. Evaluating single views, seen from specific viewing angles, gives a quality matrix that presents the 3D-image quality as a function of viewing angle. However, these two approaches do not capture all aspects of the induced distortion in a coded 3D-image. We have previously shown coding schemes of similar kind for which coding artifacts are distributed differently with respect to the 3D-image's depth. In this paper we propose a novel metric that captures the depth distribution of coding-induced distortion. Each element in the resulting quality vector corresponds to the quality at a specific depth. First, we introduce the proposed full-reference metric and the operations on which it is based. Second, the experimental setup is presented. Finally, the metric is evaluated on a set of differently coded 3D-images and the results are compared, both with previously proposed quality metrics and with visual inspection.

  • 47.
    Olsson, Roger
    et al.
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Institutionen för informationsteknologi och medier.
    Sjöström, Mårten
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Institutionen för informationsteknologi och medier.
    Multiview image coding scheme transformations: artifact characteristics and effects on perceived 3D quality (2010). In: Stereoscopic Displays and Applications XXI 2010, SPIE - International Society for Optical Engineering, 2010. Conference paper (Refereed)
    Abstract [en]

    Compression schemes for 3D images and video gain much of their efficiency from transformations that convert the signal into forms suitable for quantization. The achieved compression efficiency is principally determined by rate-distortion analysis using objective quality evaluation metrics. In 2D, quality evaluation metrics operating in the pixel domain implicitly assume an ideal display modelled as a unity transformation. Similar simplifications are not feasible in 3D analysis, and different coding schemes introduce significantly different compression artefacts even when operating at the same rate-distortion ratio.

    In this paper we have performed a subjective assessment of the quality of compressed 3D images presented on an autostereoscopic display. In the qualitative part of the assessment, different properties of the induced coding artefacts were identified with respect to image depth, pixelation, and zero-parallax distortion. The quantitative part was conducted using a group of non-expert observers who assessed the 3D quality.

    In the results we show how the compression schemes introduce specific groups of artefacts manifesting with significantly different characteristics. In addition, each characteristic is derived from the transformation domains, and the relationships between coding scheme and distortion property are presented. Moreover, the characteristics are related to the image quality assessment produced by the observation group.

  • 48.
    Olsson, Roger
    et al.
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Institutionen för informationsteknologi och medier.
    Sjöström, Mårten
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Institutionen för informationsteknologi och medier.
    Pseudo video sequence based coding of still 3D integral images. Manuscript (Other academic)
  • 49.
    Olsson, Roger
    et al.
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Institutionen för informationsteknologi och medier.
    Sjöström, Mårten
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Institutionen för informationsteknologi och medier.
    Xu, Youshi
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Institutionen för informationsteknologi och medier.
    Evaluation of Combined Pre-processing and H.264-compression Schemes for 3D Integral Images (2006). Report (Other academic)
  • 50.
    Olsson, Roger
    et al.
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Institutionen för informationsteknologi och medier.
    Sjöström, Mårten
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Institutionen för informationsteknologi och medier.
    Xu, Youzhi
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Institutionen för informationsteknologi och medier.
    A combined pre-processing and H.264-compression scheme for 3D integral images (2006). In: Proceedings International Conference on Image Processing, Vols 1-7, Atlanta, GA: IEEE, 2006, pp. 513-516. Conference paper (Refereed)
    Abstract [en]

    The next evolutionary step in enhancing video communication fidelity is taken by adding scene depth. 3D video using integral imaging (II) is widely considered the technique able to take this step. However, an increase in spatial resolution of several orders of magnitude over today's 2D video is required to provide sufficient depth fidelity. In this paper we propose a pre-processing method that aims to enhance the compression efficiency of integral images. We first transform a still integral image into a pseudo video sequence, which is then compressed using an H.264 video encoder. The improvement in compression efficiency from using this combination of pre-processing and compression is evaluated and presented. An average PSNR increase of 6.9 dB or more, compared to JPEG2000, is observed on a set of reference images.
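    The transform of a still integral image into a pseudo video sequence can be sketched as tiling the image into its elemental images and ordering them as frames, so a standard video encoder can exploit the strong inter-view redundancy as if it were temporal redundancy. The row-major scan order below is an illustrative assumption; other scan orders (e.g. serpentine) may be used in practice.

    ```python
    import numpy as np

    def to_pseudo_video(ii, block):
        """Split an integral image into its elemental images and stack them
        as frames of a pseudo video sequence in row-major scan order.
        Assumes the image dimensions are multiples of the block size."""
        h, w = ii.shape
        frames = [ii[by:by + block, bx:bx + block]
                  for by in range(0, h, block)
                  for bx in range(0, w, block)]
        return np.stack(frames)
    ```

    A 4x4 image with 2x2 elemental images yields a four-frame sequence whose first frame is the top-left block.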
