miun.se Publications
1 - 43 of 43
  • 1.
    Andersson, Håkan
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Institutionen för informationsteknologi och medier.
    3D Video Playback: A modular cross-platform GPU-based approach for flexible multi-view 3D video rendering (2010). Independent thesis, Basic level (professional degree), 10 credits / 15 HE credits. Student thesis.
    Abstract [en]

    The evolution of depth-perception visualization technologies, emerging format standardization work and research within the field of multi-view 3D video and imagery addresses the need for flexible 3D video visualization. The wide variety of available 3D-display types and visualization techniques for multi-view video, as well as the high throughput requirements for high-definition video, address the need for a real-time 3D video playback solution that takes advantage of hardware-accelerated graphics while providing a high degree of flexibility through format configuration and cross-platform interoperability. A modular, component-based software solution is proposed, based on FFmpeg for video demultiplexing and decoding, OpenGL and GLUT for hardware-accelerated graphics, and POSIX threads for increased CPU utilization. The solution has been verified to have sufficient throughput to display 1080p video at the native video frame rate on the experimental system, a standard high-end desktop PC using only commercial hardware. In order to evaluate the performance of the proposed solution, a number of throughput evaluation metrics have been introduced that measure average frame rate as a function of video bit rate, video resolution and number of views. The results obtained indicate that the GPU constitutes the primary bottleneck in a multi-view lenticular rendering system and that multi-view rendering performance degrades as the number of views increases. This is a result of current GPU square-matrix texture cache architectures, which produce texture lookup access times according to random memory access patterns when the number of views is high. The proposed solution was found to provide low CPU efficiency, i.e., low CPU hardware utilization, and it is recommended to increase performance by investigating the gains of scalable multithreading techniques. It is also recommended to investigate the gains of introducing video frame buffering in video memory, or of moving more calculations to the CPU in order to increase GPU performance.
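
    The throughput metric described in this abstract (average frame rate as a function of bit rate, resolution and number of views) reduces to a simple timing harness. A minimal sketch in Python; decode_and_render is a hypothetical stand-in for the thesis's FFmpeg/OpenGL pipeline, which is not reproduced here.

        import time

        def average_frame_rate(decode_and_render, num_frames=500):
            """Measure the average frames per second over a fixed frame count.

            decode_and_render: callable that decodes and draws one frame
            (hypothetical stand-in for an FFmpeg/OpenGL pipeline).
            """
            start = time.perf_counter()
            for _ in range(num_frames):
                decode_and_render()
            elapsed = time.perf_counter() - start
            return num_frames / elapsed

        # Sweeping a parameter (e.g., the number of rendered views) with this
        # harness yields curves like "average frame rate vs. number of views".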

  • 2.
    Brunnström, Kjell
    et al.
    RISE Acreo AB.
    Barkowsky, Marcus
    Deggendorf Institute of Technology (DIT), University of Applied Sciences, Deggendorf.
    Statistical quality of experience analysis: on planning the sample size and statistical significance testing (2018). In: Journal of Electronic Imaging (JEI), ISSN 1017-9909, E-ISSN 1560-229X, Vol. 27, no. 5, pp. 053013-1-053013-11, article id 053013. Journal article (Refereed)
    Abstract [en]

    This paper analyzes how an experimenter can balance errors in subjective video quality tests between the statistical power of finding an effect if it is there and not claiming that an effect is there if the effect is not there, i.e., balancing Type I and Type II errors. The risk of committing Type I errors increases with the number of comparisons that are performed in statistical tests. We will show that when controlling for this and at the same time keeping the power of the experiment at a reasonably high level, it is unlikely that the number of test subjects that is normally used and recommended by the International Telecommunication Union (ITU), i.e., 15, is sufficient, but the number used by the Video Quality Experts Group (VQEG), i.e., 24, is more likely to be sufficient. Examples will also be given for the influence of Type I error on the statistical significance of comparing objective metrics by correlation. We also present a comparison between parametric and nonparametric statistics. The comparison targets the question whether we would reach different conclusions on the statistical difference between the video quality ratings of different video clips in a subjective test, based on the comparison between the Student's t-test and the Mann-Whitney U-test. We found that there was hardly a difference when few comparisons are compensated for, i.e., then almost the same conclusions are reached. When the number of comparisons is increased, larger and larger differences between the two methods are revealed. In these cases, the parametric t-test gives clearly more significant cases than the nonparametric test, which makes it more important to investigate whether the assumptions are met for performing a certain test.
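
    The parametric/nonparametric comparison described above can be reproduced in outline with SciPy: run Student's t-test and the Mann-Whitney U-test on each pair of conditions and apply a Bonferroni correction so the family-wise Type I error stays controlled as the number of comparisons grows. A minimal sketch under the assumption that ratings are stored per condition; it is not the authors' exact analysis.

        from itertools import combinations
        from scipy import stats

        def compare_conditions(ratings, alpha=0.05):
            """Pairwise t-test vs. Mann-Whitney U with Bonferroni correction.

            ratings: dict mapping condition name -> list of subjective scores.
            """
            pairs = list(combinations(ratings, 2))
            alpha_c = alpha / len(pairs)  # Bonferroni: control family-wise Type I error
            for a, b in pairs:
                _, p_t = stats.ttest_ind(ratings[a], ratings[b])
                _, p_u = stats.mannwhitneyu(ratings[a], ratings[b],
                                            alternative="two-sided")
                print(f"{a} vs {b}: t-test p={p_t:.4f}, U-test p={p_u:.4f}, "
                      f"corrected alpha={alpha_c:.4f}")

    As the paper observes, the more pairs there are, the smaller the corrected alpha becomes, and the two tests begin to disagree on which comparisons remain significant.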

  • 3. Carle, Fredrik
    et al.
    Koptioug, Andrei
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Institutionen för teknik och hållbar utveckling.
    Portable rescue device and a method for locating such a device (2007). Patent (Other (popular science, discussion, etc.))
    Abstract [en]

    A portable rescue device and a method for locating, by means of a first rescue device set in a search mode, a second rescue device set in a distress mode. In the method, a distress signal carrying a device identification is received from said second rescue device. A first bearing and a second bearing to the second rescue device are obtained. The first and second bearings are taken from a first and a second position, respectively. A distance between these positions is determined. A current distance and a current bearing to the second rescue device are determined on the basis of the first and second bearings and the distance. The current bearing and the current distance are communicated to a user of the first rescue device. The portable rescue device is used for performing the method, and for that purpose it includes a first communication unit for distress signal transmission and reception; a compass; a processor; a user interface; and a mode switch for switching between a search mode and a distress signal mode. The first communication device has an antenna structure that provides directional capability.
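
    The locating principle in this patent lends itself to a compact worked example: two bearings taken from positions a known distance apart define two rays whose intersection is the target, from which the current distance and bearing follow. The sketch below is an illustrative reconstruction in local east/north coordinates, not the patented implementation.

        import math

        def bearing_to_unit(bearing_deg):
            """Compass bearing (degrees clockwise from north) -> (east, north) unit vector."""
            rad = math.radians(bearing_deg)
            return math.sin(rad), math.cos(rad)

        def locate(p1, bearing1_deg, p2, bearing2_deg):
            """Intersect two bearing rays; return (distance, bearing) from p2 to target.

            p1, p2: (east, north) positions in metres where the bearings were taken.
            """
            d1 = bearing_to_unit(bearing1_deg)
            d2 = bearing_to_unit(bearing2_deg)
            # Solve p1 + t1*d1 = p2 + t2*d2 (2x2 linear system, Cramer's rule).
            det = d1[0] * (-d2[1]) - (-d2[0]) * d1[1]
            if abs(det) < 1e-9:
                raise ValueError("bearings are parallel; take the second fix elsewhere")
            rx, ry = p2[0] - p1[0], p2[1] - p1[1]
            t1 = (rx * (-d2[1]) - (-d2[0]) * ry) / det
            target = (p1[0] + t1 * d1[0], p1[1] + t1 * d1[1])
            dx, dy = target[0] - p2[0], target[1] - p2[1]
            return math.hypot(dx, dy), math.degrees(math.atan2(dx, dy)) % 360.0

        # Example: fixes 100 m apart, bearings 40 and 320 degrees.
        print(locate((0, 0), 40.0, (100, 0), 320.0))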

  • 4.
    Comstedt, Erik
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informationssystem och -teknologi.
    Effect of additional compression features on H.264 surveillance video (2017). Independent thesis, Basic level (degree of Bachelor), 10 credits / 15 HE credits. Student thesis.
    Abstract [en]

    In the video surveillance business, a recurring topic of discussion is quality versus data usage. Higher quality allows more details to be captured, at the cost of a higher bit rate, and for cameras monitoring events 24 hours a day, limiting data usage can quickly become a factor to consider. The purpose of this thesis has been to apply additional compression features to an H.264 video stream and evaluate their effects on the video's overall quality. Using a surveillance camera, recordings of video streams were obtained. These recordings had constant GOP and frame rates. By breaking down one of these videos into an image sequence, it was possible to encode the image sequence into video streams with variable GOP/FPS using the software FFmpeg. Additionally, a user test was performed on these video streams, following the DSCQS standard from the ITU-R recommendation. The participants had to subjectively determine the quality of the video streams. The results from these tests showed that the participants did not notice any considerable difference in quality between the normal videos and the videos with variable GOP/FPS. Based on these results, the thesis has shown that additional compression features can be applied to H.264 surveillance streams without having a substantial effect on the video streams' overall quality.
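
    The encoding step described above (turning an image sequence into H.264 streams with different GOP lengths and frame rates) maps onto standard FFmpeg options: -g sets the GOP size and -framerate/-r set frame rates. A minimal sketch; the file names and parameter values below are placeholders, not those used in the thesis.

        import subprocess

        def encode_sequence(pattern, fps, gop, out_file):
            """Encode an image sequence to H.264 with a given frame rate and GOP size."""
            subprocess.run([
                "ffmpeg",
                "-framerate", str(fps),   # input rate of the image sequence
                "-i", pattern,            # e.g. "frame_%04d.png" (placeholder)
                "-c:v", "libx264",        # H.264 encoder
                "-g", str(gop),           # GOP length (keyframe interval)
                out_file,
            ], check=True)

        # encode_sequence("frame_%04d.png", 25, 50, "out_g50.mp4")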

  • 5.
    Damghanian, Mitra
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    The Sampling Pattern Cube: A Framework for Representation and Evaluation of Plenoptic Capturing Systems (2013). Licentiate thesis, monograph (Other academic)
    Abstract [en]

    Digital cameras have already entered our everyday life. Rapid technological advances have made it easier and cheaper to develop new cameras with unconventional structures. The plenoptic camera is one of these new devices; it captures light information that can later be processed for applications such as focus adjustment. High-level camera properties, such as spatial or angular resolution, are required to evaluate and compare plenoptic cameras. With complex camera structures that introduce trade-offs between various high-level camera properties, it is no longer straightforward to describe and extract these properties. Proper models, methods and metrics with the desired level of detail are beneficial for describing and evaluating plenoptic camera properties.

    This thesis describes and evaluates camera properties using a model-based representation of plenoptic capturing systems in favour of a unified language. The sampling pattern cube (SPC) model is proposed; it describes which light samples from the scene are captured by the camera system. Light samples in the SPC model carry the ray and focus information of the capturing setup. To demonstrate the capabilities of the introduced model, property extractors for lateral resolution are defined and evaluated. The lateral resolution values obtained from the introduced model are compared with the results from the ray-based model and with ground truth data. The knowledge about how to generate and visualize the proposed model, and how to extract camera properties from the model-based representation of the capturing system, is collated to form the SPC framework.

    The main outcomes of the thesis can be summarized in the following points: A model-based representation of the light sampling behaviour of the plenoptic capturing system is introduced, which incorporates the focus information as well as the ray information. A framework is developed to generate the SPC model and to extract high-level properties of the plenoptic capturing system. Results confirm that the SPC model is capable of describing the light sampling behaviour of the capturing system, and that the SPC framework is capable of extracting high-level camera properties at a higher descriptive level than the ray-based model. The results from the proposed model compete with those from the more elaborate wave-optics model in the ranges where the wave nature of light is not dominant. The outcome of the thesis can benefit the design, evaluation and comparison of complex capturing systems.

  • 6.
    Damghanian, Mitra
    et al.
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Institutionen för informationsteknologi och medier.
    Olsson, Roger
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Institutionen för informationsteknologi och medier.
    Sjöström, Mårten
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Institutionen för informationsteknologi och medier.
    The Sampling Pattern Cube: A Representation and Evaluation Tool for Optical Capturing Systems (2012). In: Advanced Concepts for Intelligent Vision Systems / [ed] Blanc-Talon, Jacques; Philips, Wilfried; Popescu, Dan; Scheunders, Paul; Zemcík, Pavel. Berlin/Heidelberg: Springer, 2012, pp. 120-131. Conference paper (Refereed)
    Abstract [en]

    Knowledge about how the light field is sampled through a camera system gives the information required to investigate interesting camera parameters. We introduce a simple and handy model for looking into the sampling behavior of a camera system. We have applied this model to a single-lens system as well as to plenoptic cameras. We have investigated how camera parameters of interest are interpreted in our proposed model-based representation. This model also enables us to make comparisons between capturing systems, or to investigate how variations in an optical capturing system affect its sampling behavior.
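
    As a rough illustration of what a model-based representation like this could store, the sketch below defines a container for light samples carrying both ray information and focus information, in line with the description above. The field choices are my assumption for illustration, not the paper's definition of the SPC.

        from dataclasses import dataclass, field
        from typing import List

        @dataclass
        class LightSample:
            """One captured light sample: ray plus focus information (assumed fields)."""
            x: float            # position on the sampling plane
            y: float
            theta: float        # ray direction (two angles)
            phi: float
            focus_depth: float  # depth at which this sample is in focus

        @dataclass
        class SamplingPatternCube:
            """The set of light samples a capturing system admits (illustrative only)."""
            samples: List[LightSample] = field(default_factory=list)

            def focus_depths(self):
                """Example of a simple property extractor over the model."""
                return sorted({s.focus_depth for s in self.samples})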

  • 7.
    Damghanian, Mitra
    et al.
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Olsson, Roger
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Sjöström, Mårten
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Erdmann, Arne
    Raytrix Gmbh.
    Perwass, Christian
    Raytrix Gmbh.
    Spatial resolution in a multi-focus plenoptic camera (2014). In: IEEE International Conference on Image Processing, ICIP 2014, IEEE conference proceedings, 2014, pp. 1932-1936, article id 7025387. Conference paper (Refereed)
    Abstract [en]

    Evaluation of state-of-the-art plenoptic cameras is necessary for design and application purposes. In this work, spatial resolution is investigated in a multi-focus plenoptic camera using two approaches: empirical and model-based. The Raytrix R29 plenoptic camera is studied, which utilizes three types of microlenses with different focal lengths in a hexagonal array structure to increase the depth of field. The model-based approach utilizes the previously proposed sampling pattern cube (SPC) model for representation and evaluation of plenoptic capturing systems. For the experimental resolution measurements, spatial resolution values are extracted from images reconstructed by the provided Raytrix reconstruction method. Both the measurement and the SPC model based approaches demonstrate a gradual variation of the resolution values over a wide depth range for the multi-focus R29 camera. Moreover, the good agreement between the results from the model-based approach and those from the empirical approach confirms the suitability of the SPC model for evaluating high-level camera parameters, such as the spatial resolution, in a complex capturing system such as the R29 multi-focus plenoptic camera.

  • 8.
    Damghanian, Mitra
    et al.
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Olsson, Roger
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Sjöström, Mårten
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Navarro Fructuoso, Hector
    Department of Optics, University of Valencia, Spain.
    Martinez Corral, Manuel
    Department of Optics, University of Valencia, Spain.
    Investigating the lateral resolution in a plenoptic capturing system using the SPC model (2013). In: Proceedings of SPIE - The International Society for Optical Engineering: Digital Photography IX, SPIE - International Society for Optical Engineering, 2013, article 86600T. Conference paper (Refereed)
    Abstract [en]

    Complex multidimensional capturing setups such as plenoptic cameras (PC) introduce a trade-off between various system properties. Consequently, established capturing properties, like image resolution, need to be described thoroughly for these systems. Therefore, models and metrics that assist in exploring and formulating this trade-off are highly beneficial for studying as well as designing complex capturing systems. This work demonstrates the capability of our previously proposed sampling pattern cube (SPC) model to extract the lateral resolution of plenoptic capturing systems. The SPC carries both ray information and focal properties of the capturing system it models. The proposed operator extracts the lateral resolution from the SPC model throughout an arbitrary number of depth planes, giving a depth-resolution profile. This operator utilizes focal properties of the capturing system as well as the geometrical distribution of the light containers, which are the elements of the SPC model. We have validated the lateral resolution operator for different capturing setups by comparing the results with those from Monte Carlo numerical simulations based on the wave-optics model. The lateral resolution predicted by the SPC model agrees with the results from the more complex wave-optics model better than both the ray-based model and our previously proposed lateral resolution operator. This agreement strengthens the conclusion that the SPC fills the gap between ray-based models and real system performance by including the focal information of the system as a model parameter. The SPC is proven to be a simple yet efficient model for extracting the lateral resolution as a high-level property of complex plenoptic capturing systems.

  • 9.
    Dima, Elijs
    et al.
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Sjöström, Mårten
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Olsson, Roger
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Assessment of Multi-Camera Calibration Algorithms for Two-Dimensional Camera Arrays Relative to Ground Truth Position and Direction (2016). In: 3DTV-Conference, IEEE Computer Society, 2016, article id 7548887. Conference paper (Refereed)
    Abstract [en]

    Camera calibration methods are commonly evaluated on cumulative reprojection error metrics, on disparate one-dimensional datasets. To evaluate calibration of cameras in two-dimensional arrays, assessments need to be made on two-dimensional datasets with constraints on camera parameters. In this study, the accuracy of several multi-camera calibration methods has been evaluated on the camera parameters that affect view projection the most. As input data, we used a 15-viewpoint two-dimensional dataset with intrinsic and extrinsic parameter constraints and extrinsic ground truth. The assessment showed that self-calibration methods using structure-from-motion reach intrinsic and extrinsic parameter estimation accuracy equal to that of the standard checkerboard calibration algorithm, and surpass a well-known self-calibration toolbox, BlueCCal. These results show that self-calibration is a viable approach to calibrating two-dimensional camera arrays, but improvements to state-of-the-art multi-camera feature matching are necessary to make BlueCCal as accurate as other self-calibration methods for two-dimensional camera arrays.
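
    The standard checkerboard calibration algorithm used as a baseline here is available in OpenCV. A minimal single-camera sketch, assuming a board with 9x6 inner corners; a multi-camera assessment like the one above would repeat this per camera and compare the estimated extrinsics against ground truth.

        import cv2
        import numpy as np

        def calibrate(images, board=(9, 6), square_size=1.0):
            """Estimate camera intrinsics from checkerboard images (OpenCV pipeline)."""
            # 3-D corner coordinates on the planar board (z = 0).
            objp = np.zeros((board[0] * board[1], 3), np.float32)
            objp[:, :2] = np.mgrid[0:board[0], 0:board[1]].T.reshape(-1, 2) * square_size
            obj_points, img_points = [], []
            for img in images:  # assumes a non-empty list of BGR images
                gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
                found, corners = cv2.findChessboardCorners(gray, board, None)
                if found:
                    obj_points.append(objp)
                    img_points.append(corners)
            # Returns reprojection error, camera matrix, distortion, rotations, translations.
            return cv2.calibrateCamera(obj_points, img_points, gray.shape[::-1], None, None)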

  • 10.
    Dima, Elijs
    et al.
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informationssystem och -teknologi.
    Sjöström, Mårten
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informationssystem och -teknologi.
    Olsson, Roger
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informationssystem och -teknologi.
    Modeling Depth Uncertainty of Desynchronized Multi-Camera Systems (2017). In: 2017 International Conference on 3D Immersion (IC3D), IEEE, 2017. Conference paper (Refereed)
    Abstract [en]

    Accurately recording motion from multiple perspectives is relevant for recording and processing immersive multi-media and virtual reality content. However, synchronization errors between multiple cameras limit the precision of scene depth reconstruction and rendering. In order to quantify this limit, a relation between camera de-synchronization, camera parameters, and scene element motion has to be identified. In this paper, a parametric ray model describing depth uncertainty is derived and adapted for the pinhole camera model. A two-camera scenario is simulated to investigate the model behavior and how camera synchronization delay, scene element speed, and camera positions affect the system's depth uncertainty. Results reveal a linear relation between synchronization error, element speed, and depth uncertainty. View convergence is shown to affect mean depth uncertainty up to a factor of 10. Results also show that depth uncertainty must be assessed on the full set of camera rays instead of a central subset.
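
    The reported linear relation can be sanity-checked with a toy model: in a rectified two-camera setup, a camera that lags by delay_s sees a point moving at speed_mps displaced sideways, which shifts its disparity and hence the triangulated depth. This simplified sketch assumes parallel pinhole cameras and horizontal motion; it is not the paper's ray model.

        def depth_error(focal_px, baseline_m, depth_m, speed_mps, delay_s):
            """Depth error when one camera of a rectified pair lags by delay_s."""
            true_disparity = focal_px * baseline_m / depth_m
            shift_px = focal_px * speed_mps * delay_s / depth_m  # projected motion
            depth_est = focal_px * baseline_m / (true_disparity + shift_px)
            return abs(depth_est - depth_m)

        # Doubling either speed or delay doubles the projected shift; for small
        # shifts the depth error grows approximately linearly with it.
        print(depth_error(focal_px=1000, baseline_m=0.1, depth_m=5.0,
                          speed_mps=1.0, delay_s=0.02))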

  • 11.
    Jonsson, Patrik
    et al.
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Institutionen för informationsteknologi och medier.
    Dobslaw, Felix
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Institutionen för informationsteknologi och medier.
    Decision support system for variable speed regulation (2012). Conference paper (Refereed)
    Abstract [en]

    Recommending a suitable speed limit for roads is important for road authorities in order to increase traffic safety. Nowadays, these speed limits can be set more dynamically, with digital speed regulation signs. The challenge here is input from the environment, in combination with probabilities for certain events. Here we present a decision support model based on a dynamic Bayesian network. The purpose of this model is to predict the appropriate speed on the basis of weather data, traffic density and road maintenance activities. The dynamic Bayesian network principle of treating the involved variables as uncertain makes it possible to model real conditions. This model shows that it is possible to develop automated decision support systems for variable speed regulation.
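
    To make the idea concrete, here is a minimal hand-rolled sketch of probabilistic speed recommendation: weigh each candidate speed by how probable each road state is given uncertain weather evidence. The states, probabilities and speeds are invented for illustration and are not taken from the paper's Bayesian network.

        # P(road_state | weather): invented conditional probability table.
        P_STATE = {
            "clear": {"dry": 0.90, "wet": 0.09, "icy": 0.01},
            "rain":  {"dry": 0.10, "wet": 0.80, "icy": 0.10},
            "snow":  {"dry": 0.02, "wet": 0.28, "icy": 0.70},
        }
        SAFE_SPEED = {"dry": 110, "wet": 90, "icy": 60}  # km/h per state (invented)

        def recommend_speed(weather_belief):
            """Expected safe speed under uncertainty about the weather.

            weather_belief: dict weather -> probability, e.g. {"rain": 0.7, "snow": 0.3}.
            """
            return sum(p_w * p_s * SAFE_SPEED[state]
                       for weather, p_w in weather_belief.items()
                       for state, p_s in P_STATE[weather].items())

        print(round(recommend_speed({"rain": 0.7, "snow": 0.3})))  # -> 83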

  • 12.
    Karlsson, Linda Sofia
    et al.
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Institutionen för informationsteknologi och medier.
    Sjöström, Mårten
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Institutionen för informationsteknologi och medier.
    A Spatio-Temporal Filter for Region-of-Interest Video Coding. Manuscript (preprint) (Other academic)
    Abstract [en]

    Region-of-interest (ROI) video coding increases the quality in regions interesting to the viewer at the expense of quality in the background. This enables a high perceived quality at low bit rates. A successfully detected ROI can be used to control the bit allocation in the encoding. In this paper we present a filter that is independent of codec and standard. It is applied in both the spatial and the temporal domains. We analyze theoretically the filter's ability to reduce the number of bits necessary to encode the background, and where these bits are re-allocated. The computational complexity of the algorithms is also determined. The quality is evaluated using PSNR of the ROI and subjective tests. Tests showed that the spatio-temporal filter has better coding efficiency than using only spatial or only temporal filtering. The filter successfully re-allocates the bits from the background to the foreground.
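
    A generic version of the codec-independent idea can be sketched with NumPy/SciPy: low-pass filter the background both spatially and temporally so the encoder spends fewer bits there, while leaving the ROI untouched. The kernel size and temporal blend are assumptions for illustration, not the filter design analyzed in the paper.

        import numpy as np
        from scipy.ndimage import gaussian_filter

        def roi_prefilter(frame, prev_out, roi_mask, sigma=2.0, alpha=0.5):
            """Blur the background spatially and temporally; keep the ROI intact.

            frame: current frame as a 2-D float array; roi_mask: True inside the ROI;
            prev_out: previously returned frame (temporal recursion), or None.
            """
            background = gaussian_filter(frame, sigma)          # spatial low-pass
            if prev_out is not None:                            # temporal low-pass (IIR)
                background = alpha * background + (1 - alpha) * prev_out
            return np.where(roi_mask, frame, background)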

  • 13.
    Karlsson, Linda Sofia
    et al.
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Institutionen för informationsteknologi och medier.
    Sjöström, Mårten
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Institutionen för informationsteknologi och medier.
    Layer assignment based on depth data distribution for multiview-plus-depth scalable video coding (2011). In: IEEE Transactions on Circuits and Systems for Video Technology (Print), ISSN 1051-8215, E-ISSN 1558-2205, Vol. 21, no. 6, pp. 742-754. Journal article (Refereed)
    Abstract [en]

    Three-dimensional (3D) video is experiencing rapid growth in a number of areas, including 3D cinema, 3DTV and mobile phones. Several problems must be addressed to display captured 3D video at another location. One problem is how to represent the data. The multiview-plus-depth representation of a scene requires a lower bit rate than transmitting all views required by an application, and provides more information than a 2D-plus-depth sequence. Another problem is how to handle transmission in a heterogeneous network. Scalable video coding enables adaptation of a 3D video sequence to the conditions at the receiver. In this paper we present a scheme that combines scalability based on the position in depth of the data and the distance to the center view. The general scheme preserves the center view data, whereas the data of the remaining views are extracted in enhancement layers depending on the distance to the viewer and to the center camera. Within a view, the data is assigned to enhancement layers based on the depth data distribution. Strategies concerning the layer assignment between adjacent views are proposed. In general, each extracted enhancement layer increases the visual quality and PSNR compared to only using center view data. The bit rate per layer can be further decreased if depth data is distributed over the enhancement layers. The choice of strategy for assigning layers between adjacent views depends on whether the quality of the foremost objects in the scene or the quality of the views close to the center is more important.
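
    One simple way to realize assignment "into enhancement layers based on the depth data distribution" is to split the depth range at quantiles of the depth histogram, so that each layer holds a roughly equal share of pixels. A sketch under that assumption; the paper's actual assignment strategies, in particular between adjacent views, are more elaborate.

        import numpy as np

        def assign_layers(depth, num_layers=4):
            """Map each pixel to an enhancement layer from the depth distribution.

            Boundaries are depth-histogram quantiles, so every layer covers
            roughly the same number of pixels; layer 0 holds the nearest pixels.
            """
            qs = np.quantile(depth, np.linspace(0, 1, num_layers + 1)[1:-1])
            return np.digitize(depth, qs)  # integer layer index per pixel

        depth = np.random.rand(480, 640).astype(np.float32)  # stand-in depth map
        print(np.bincount(assign_layers(depth).ravel()))     # near-equal counts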

  • 14.
    Li, Yun
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Coding of three-dimensional video content: Depth image coding by diffusion (2013). Licentiate thesis, comprehensive summary (Other academic)
    Abstract [en]

    Three-dimensional (3D) movies in theaters have become a massive commercial success during recent years, and it is likely that, with the advancement of display technologies and the production of 3D content, TV broadcasting in 3D will play an important role in home entertainment in the not too distant future. 3D video content contains at least two views from different perspectives, for the left and the right eye of viewers. The amount of coded information is doubled if these views are encoded separately. Moreover, for multi-view displays (where different perspectives of a scene in 3D are presented to the viewer at the same time through different angles), either video streams of all the required views must be transmitted to the receiver, or the displays must synthesize the missing views from a subset of the views. The latter approach has been widely proposed to reduce the amount of data being transmitted. The virtual views can be synthesized by the Depth Image Based Rendering (DIBR) approach from textures and associated depth images. However, it is still the case that the amount of information for the textures plus the depths presents a significant challenge for the network transmission capacity. An efficient compression will, therefore, increase the availability of content access and provide better video quality under the same network capacity constraints.

    In this thesis, the compression of depth images is addressed. These depth images can be assumed as being piece-wise smooth. Starting from the properties of depth images, a novel depth image model based on edges and sparse samples is presented, which may also be utilized for depth image post-processing. Based on this model, a depth image coding scheme that explicitly encodes the locations of depth edges is proposed, and the coding scheme has a scalable structure. Furthermore, a compression scheme for block-based 3D-HEVC is also devised, in which diffusion is used for intra prediction. In addition to the proposed schemes, the thesis illustrates several evaluation methodologies, especially, the subjective test of the stimulus-comparison method. It is suitable for evaluating the quality of two impaired images, as the objective metrics are inaccurate with respect to synthesized views.

    The MPEG test sequences were used for the evaluation. The results showed that virtual views synthesized from post-processed depth images by using the proposed model are better than those synthesized from original depth images. More importantly, the proposed coding schemes using such a model produced better synthesized views than the state of the art schemes. As a result, the outcome of the thesis can lead to a better quality of 3DTV experience.

  • 15.
    Li, Yun
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Coding of Three-dimensional Video Content: Diffusion-based Coding of Depth Images and Displacement Intra-Coding of Plenoptic Contents (2015). Doctoral thesis, monograph (Other academic)
    Abstract [en]

    In recent years, the three-dimensional (3D) movie industry has reaped massive commercial success in theaters. With the advancement of display technologies and more experienced capturing and generation of 3D content, TV broadcasting, movies and games in 3D have entered home entertainment, and it is likely that 3D applications will play an important role in many aspects of people's lives in the not too distant future. 3D video content contains at least two views from different perspectives, for the left and the right eye of viewers. The amount of coded information is doubled if these views are encoded separately. Moreover, for multi-view displays (where different perspectives of a scene in 3D are presented to the viewer at the same time through different angles), either video streams of all the required views must be transmitted to the receiver, or the displays must synthesize the missing views from a subset of the views. The latter approach has been widely proposed to reduce the amount of data being transmitted and to make the data adjustable to 3D displays. The virtual views can be synthesized by the Depth Image Based Rendering (DIBR) approach from textures and associated depth images. However, it is still the case that the amount of information for the textures plus the depths presents a significant challenge for the network transmission capacity. Compression techniques are vital to facilitate the transmission. In addition to multi-view and multi-view-plus-depth for reproducing 3D, light field techniques have recently become a hot topic. Light field capturing aims at acquiring not only spatial but also angular information of a view, and an ideal light field rendering device should be such that the viewers would perceive it as looking through a window. Thus, light field techniques are a step forward in providing us with a more authentic perception of 3D. Among many light field capturing approaches, focused plenoptic capturing is a solution that utilizes microlens arrays. Plenoptic cameras are also portable and commercially available. Multi-view and refocusing can be obtained during post-production from these cameras. However, the captured plenoptic images are of a large size and contain a significant amount of redundant information. An efficient compression of the above-mentioned contents will, therefore, increase the availability of content access and provide a better quality of experience under the same network capacity constraints. In this thesis, the compression of depth images and of plenoptic contents captured by focused plenoptic cameras is addressed. The depth images can be assumed to be piece-wise smooth. Starting from the properties of depth images, a novel depth image model based on edges and sparse samples is presented, which may also be utilized for depth image post-processing. Based on this model, a depth image coding scheme that explicitly encodes the locations of depth edges is proposed, and the coding scheme has a scalable structure. Furthermore, a compression scheme for block-based 3D-HEVC is also devised, in which diffusion is used for intra prediction. In addition to the proposed schemes, the thesis illustrates several evaluation methodologies, especially the subjective test of the stimulus-comparison method. This is suitable for evaluating the quality of two impaired images, as the objective metrics are inaccurate with respect to synthesized views.

    For the compression of plenoptic contents, displacement intra prediction with more than one hypothesis is applied and implemented in HEVC for efficient prediction. In addition, a scalable coding approach utilizing a sparse set and disparities is introduced for the coding of focused plenoptic images. The MPEG test sequences were used for the evaluation of the proposed depth image compression, and publicly available plenoptic image and video contents were used in the assessment of the proposed plenoptic compression. For depth image coding, the results showed that virtual views synthesized from depth images post-processed using the proposed model are better than those synthesized from original depth images. More importantly, the proposed coding schemes using such a model produced better synthesized views than the state-of-the-art schemes. For plenoptic contents, the proposed scheme achieved efficient prediction and reduced the bit rate significantly while providing coding and rendering scalability. As a result, the outcome of the thesis can improve the quality of the 3DTV experience and facilitate the development of 3D applications in general.

  • 16.
    Li, Yun
    et al.
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Sjöström, Mårten
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Jennehag, Ulf
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Olsson, Roger
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Depth Image Post-processing Method by Diffusion (2013). In: Proceedings of SPIE - The International Society for Optical Engineering: 3D Image Processing (3DIP) and Applications, SPIE - International Society for Optical Engineering, 2013, article 865003. Conference paper (Refereed)
    Abstract [en]

    Multi-view three-dimensional television relies on view synthesis to reduce the number of views being transmitted. Arbitrary views can be synthesized by utilizing corresponding depth images together with textures. The depth images obtained from stereo pairs or range cameras may contain erroneous values, which entail artifacts in a rendered view. Post-processing of the data may then be utilized to enhance the depth image, with the purpose of reaching better quality in the synthesized views. We propose a Partial Differential Equation (PDE)-based interpolation method for reconstructing the smooth areas in depth images while preserving significant edges. We modeled the depth image by adjusting thresholds for edge detection and a uniform sparse sampling factor, followed by second-order PDE interpolation. The objective results show that a depth image processed by the proposed method can achieve better quality in synthesized views than the original depth image. Visual inspection confirmed the results.
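
    In its simplest harmonic form, second-order PDE interpolation of this kind amounts to Laplace smoothing: iterate a four-neighbour average over unknown pixels while keeping edge and sparse-sample pixels fixed. This Jacobi-iteration sketch is a minimal stand-in for the paper's diffusion and omits its edge-detection and threshold-tuning steps.

        import numpy as np

        def diffuse(depth, known_mask, iterations=500):
            """Reconstruct a piece-wise smooth depth image from sparse known samples.

            known_mask: True where depth is trusted (edges and sparse samples);
            those pixels act as fixed boundary conditions for the diffusion.
            """
            d = depth.astype(np.float64).copy()
            for _ in range(iterations):
                # Four-neighbour average: one Jacobi step of the Laplace equation
                # (image borders wrap via np.roll; ignored for brevity).
                avg = 0.25 * (np.roll(d, 1, 0) + np.roll(d, -1, 0) +
                              np.roll(d, 1, 1) + np.roll(d, -1, 1))
                d = np.where(known_mask, d, avg)
            return d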

  • 17.
    Li, Yun
    et al.
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Sjöström, Mårten
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Jennehag, Ulf
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Olsson, Roger
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Tourancheau, Sylvain
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Subjective Evaluation of an Edge-based Depth Image Compression Scheme (2013). In: Proceedings of SPIE - The International Society for Optical Engineering: Stereoscopic Displays and Applications XXIV, SPIE - International Society for Optical Engineering, 2013, article 86480D. Conference paper (Refereed)
    Abstract [en]

    Multi-view three-dimensional television requires many views, which may be synthesized from two-dimensional images with accompanying pixel-wise depth information. This depth image, which typically consists of smooth areas and sharp transitions at object borders, must be consistent with the acquired scene in order for synthesized views to be of good quality. We have previously proposed a depth image coding scheme that preserves significant edges and encodes the smooth areas between these. An objective evaluation considering the structural similarity (SSIM) index for synthesized views demonstrated an advantage of the proposed scheme over the High Efficiency Video Coding (HEVC) intra mode in certain cases. However, there were some discrepancies between the outcomes of the objective evaluation and of our visual inspection, which motivated this study using subjective tests. The test was conducted according to the ITU-R BT.500-13 recommendation with the stimulus-comparison method. The results from the subjective test showed that the proposed scheme performs slightly better than HEVC, with statistical significance at the majority of the tested bit rates for the given contents.

  • 18.
    Li, Yun
    et al.
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Sjöström, Mårten
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Olsson, Roger
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Coding of plenoptic images by using a sparse set and disparities (2015). In: Proceedings - IEEE International Conference on Multimedia and Expo, IEEE conference proceedings, 2015, article 7177510. Conference paper (Refereed)
    Abstract [en]

    A focused plenoptic camera captures not only the spatial information of a scene but also the angular information. The capturing results in a plenoptic image of large resolution, consisting of multiple microlens images. In addition, the microlens images are similar to their neighbors. Therefore, an efficient compression method that utilizes this pattern of similarity can reduce the coding bit rate and further facilitate the usage of the images. In this paper, we propose an approach for coding focused plenoptic images by using a representation that consists of a sparse plenoptic image set and disparities. Based on this representation, a reconstruction method using interpolation and inpainting is devised to reconstruct the original plenoptic image. As a consequence, instead of coding the original image directly, we encode the sparse image set plus the disparity maps and use the reconstructed image as a prediction reference to encode the original image. The results show that the proposed scheme performs better than HEVC intra, with more than 5 dB PSNR gain or over 60 percent bit rate reduction.

  • 19.
    Li, Yun
    et al.
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Sjöström, Mårten
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Olsson, Roger
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Jennehag, Ulf
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Coding of focused plenoptic contents by displacement intra prediction (2016). In: IEEE Transactions on Circuits and Systems for Video Technology (Print), ISSN 1051-8215, E-ISSN 1558-2205, Vol. 26, no. 7, pp. 1308-1319, article id 7137669. Journal article (Refereed)
    Abstract [en]

    A light field is commonly described by a two-plane representation with four dimensions. Refocused three-dimensional contents can be rendered from light field images. A method for capturing these images is to use cameras with microlens arrays. A dense sampling of the light field results in large amounts of redundant data; therefore, efficient compression is vital for practical use of these data. In this paper, we propose a displacement intra prediction scheme with a maximum of two hypotheses for the compression of plenoptic contents from focused plenoptic cameras. The proposed scheme is further implemented in HEVC. The work aims at coding plenoptically captured contents efficiently without knowing the underlying camera geometries. In addition, a theoretical analysis of displacement intra prediction for plenoptic images is presented; the relationship between the compressed captured images and their rendered quality is also analyzed. Evaluation results show that plenoptic contents can be efficiently compressed by the proposed scheme. Bit rate reductions of up to 60 percent over HEVC are obtained for plenoptic images, and more than 30 percent for the tested video sequences.
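
    The core of displacement intra prediction can be sketched as block matching inside the already-decoded part of the same image: find the best one or two matching blocks in the causal area and, with two hypotheses, average them to form the predictor. The exhaustive search below is an illustrative simplification, not the HEVC integration itself.

        import numpy as np

        def displacement_predict(image, top, left, size, search=32, num_hyp=2):
            """Predict the block at (top, left) from previously coded image area.

            Collects candidate blocks above/left of the current block (coarse
            causal check), keeps the num_hyp best SAD matches, averages them.
            """
            target = image[top:top + size, left:left + size].astype(int)
            candidates = []
            for dy in range(-search, 1):
                for dx in range(-search, search + 1):
                    if dy == 0 and dx > -size:
                        continue  # skip the block itself and undecoded area
                    y, x = top + dy, left + dx
                    if y < 0 or x < 0 or x + size > image.shape[1]:
                        continue
                    block = image[y:y + size, x:x + size].astype(int)
                    candidates.append((np.abs(block - target).sum(), block))
            candidates.sort(key=lambda c: c[0])
            best = [blk for _, blk in candidates[:num_hyp]]
            return np.mean(best, axis=0)  # averaged multi-hypothesis predictor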

  • 20.
    Li, Yun
    et al.
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Sjöström, Mårten
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Olsson, Roger
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Jennehag, Ulf
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Efficient Intra Prediction Scheme for Light Field Image Compression (2014). In: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings, IEEE conference proceedings, 2014, article 6853654. Conference paper (Refereed)
    Abstract [en]

    Interactive photo-realistic graphics can be rendered using light field datasets. One way of capturing the dataset is to use light field cameras with microlens arrays. The captured images contain repetitive patterns resulting from adjacent microlenses, and they do not resemble the appearance of a natural scene. This dissimilarity leads to problems in light field image compression with traditional image and video encoders, which are optimized for natural images and video sequences. In this paper, we introduce the full inter-prediction scheme of HEVC into intra prediction for the compression of light field images. The proposed scheme is capable of performing both unidirectional and bi-directional prediction within an image. The evaluation results show that quality improvements above 3 dB, or bit rate savings above 50 percent, in terms of BD-PSNR can be achieved by the proposed scheme compared to the original HEVC intra prediction for light field images.

  • 21.
    Li, Yun
    et al.
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Sjöström, Mårten
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Olsson, Roger
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Jennehag, Ulf
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Scalable coding of plenoptic images by using a sparse set and disparities (2016). In: IEEE Transactions on Image Processing, ISSN 1057-7149, E-ISSN 1941-0042, Vol. 25, no. 1, pp. 80-91, article id 7321029. Journal article (Refereed)
    Abstract [en]

    One of the light field capturing techniques is focused plenoptic capturing. By placing a microlens array in front of the photosensor, focused plenoptic cameras capture both spatial and angular information of a scene, within each microlens image and across microlens images. The capturing results in a significant amount of redundant information, and the captured image is usually of a large resolution. A coding scheme that removes the redundancy before coding can be advantageous for efficient compression, transmission and rendering. In this paper, we propose a lossy coding scheme to efficiently represent plenoptic images. The format contains a sparse image set and its associated disparities. The reconstruction is performed by disparity-based interpolation and inpainting, and the reconstructed image is then employed as a prediction reference for the coding of the full plenoptic image. As an outcome of the representation, the proposed scheme inherits a scalable structure with three layers. The results show that plenoptic images are compressed efficiently, with over 60 percent bit rate reduction compared to HEVC intra, and with over 20 percent compared to the HEVC block copying mode.

  • 22.
    Muddala, Suryanarayana M.
    et al.
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Olsson, Roger
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Sjöström, Mårten
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Disocclusion Handling Using Depth-Based Inpainting (2013). In: Proceedings of MMEDIA 2013, The Fifth International Conference on Advances in Multimedia, Venice, Italy, 2013, International Academy, Research and Industry Association (IARIA), 2013, pp. 136-141. Conference paper (Refereed)
    Abstract [en]

    Depth image based rendering (DIBR) plays an important role in producing virtual views using 3D video formats such as video-plus-depth (V+D) and multi-view-video-plus-depth (MVD). Pixel regions with non-defined values (due to disoccluded areas) are exposed when DIBR is used. In this paper, we propose a depth-based inpainting method aimed at handling disocclusions in DIBR from V+D and MVD. Our proposed method adopts the curvature driven diffusion (CDD) model as a data term, to which we add a depth constraint. In addition, we add depth to further guide a directional priority term in the exemplar-based texture synthesis. Finally, we add depth in the patch-matching step to prioritize background texture when inpainting. The proposed method is evaluated by comparing inpainted virtual views with corresponding views produced by three state-of-the-art inpainting methods as references. The evaluation shows that the proposed method yields increased objective quality compared to the reference methods, and visual inspection further indicates improved visual quality.

  • 23.
    Muddala, Suryanarayana Murthy
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Free View Rendering for 3D Video: Edge-Aided Rendering and Depth-Based Image Inpainting (2015). Doctoral thesis, monograph (Other academic)
    Abstract [en]

    Three-Dimensional Video (3DV) has become increasingly popular with the success of 3D cinema. Moreover, emerging display technology offers an immersive experience to the viewer without the necessity of any visual aids such as 3D glasses. The 3DV applications Three-Dimensional Television (3DTV) and Free Viewpoint Television (FTV) are promising technologies for living-room environments, providing an immersive experience and look-around capability. In order to provide such an experience, these technologies require a number of camera views captured from different viewpoints. However, capturing and transmitting the required number of views is not a feasible solution, and thus view rendering is employed as an efficient solution to produce the necessary number of views. Depth-image-based rendering (DIBR) is a commonly used rendering method. Although DIBR is a simple approach that can produce the desired number of views, inherent artifacts are a major issue in view rendering. Despite much effort to tackle the rendering artifacts over the years, rendered views still contain visible artifacts.

    This dissertation addresses three problems in order to improve 3DV quality: 1) How to improve rendered view quality using a direct approach, without dealing with each artifact specifically. 2) How to handle disocclusions (a.k.a. holes) in the rendered views in a visually plausible manner using inpainting. 3) How to reduce spatial inconsistencies in the rendered view. The first problem is tackled by an edge-aided rendering method that uses a direct approach with one-dimensional interpolation, which is applicable when the virtual camera distance is small. The second problem is addressed by a depth-based inpainting method in the virtual view, which reconstructs the missing texture at the disocclusions with background data. The third problem is handled by a rendering method that first inpaints occlusions as a layered depth image (LDI) in the original view, and then renders a spatially consistent virtual view.

    Objective assessments of the proposed methods show improvements over state-of-the-art rendering methods. Visual inspection shows slight improvements for intermediate views rendered from multiview video-plus-depth, and the proposed methods outperform other view rendering methods in the case of rendering from single-view video-plus-depth. The results confirm that the proposed methods are capable of reducing rendering artifacts and producing spatially consistent virtual views.

    In conclusion, the view rendering methods proposed in this dissertation can support the production of high quality virtual views based on a limited number of input views. When used to create a multi-scopic presentation, the outcome of this dissertation can benefit 3DV technologies to improve the immersive experience.

  • 24.
    Muddala, Suryanarayana Murthy
    et al.
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Olsson, Roger
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Sjöström, Mårten
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Depth-Included Curvature Inpainting for Disocclusion Filling in View Synthesis (2013). In: International Journal On Advances in Telecommunications, ISSN 1942-2601, E-ISSN 1942-2601, Vol. 6, no. 3&4, pp. 132-142. Journal article (Refereed)
    Abstract [en]

    Depth-image-based rendering (DIBR) is commonly used for generating additional views for 3DTV and FTV from 3D video formats such as video-plus-depth (V+D) and multi-view-video-plus-depth (MVD). When DIBR is used, the synthesized views suffer from artifacts, mainly disocclusions. Depth-based inpainting methods can solve these problems plausibly. In this paper, we analyze the influence of the depth information at the various steps of the depth-included curvature inpainting method. The depth-based inpainting method relies on depth information at every step of the inpainting process: boundary extraction for missing areas, data term computation for structure propagation, and patch matching to find the best data. The importance of depth at each step is evaluated using objective metrics and visual comparison. Our evaluation demonstrates that the depth information in each step plays a key role. Moreover, to what degree depth can be used in each step of the inpainting process depends on the depth distribution.

  • 25.
    Muddala, Suryanarayana Murthy
    et al.
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Sjöström, Mårten
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Olsson, Roger
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Depth-Based Inpainting for Disocclusion Filling (2014). In: 3DTV-Conference, IEEE Computer Society, 2014, article 6874752. Conference paper (Refereed)
    Abstract [en]

    Depth-based inpainting methods can solve disocclusion problems occurring in depth-image-based rendering. However, inpainting in this context suffers from artifacts along foreground objects due to foreground pixels in the patch matching. In this paper, we address the disocclusion problem with a refined depth-based inpainting method. The novelty lies in classifying the foreground and background using the available local depth information. Thereby, foreground information is excluded from both the source region and the target patch. In the proposed inpainting method, the local depth constraints imply inpainting only the background data and preserving the foreground object boundaries. The results from the proposed method are compared with those from state-of-the-art inpainting methods. The experimental results demonstrate improved objective quality and better visual quality along the object boundaries.

  • 26.
    Muddala, Suryanarayana Murthy
    et al.
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Institutionen för informationsteknologi och medier.
    Sjöström, Mårten
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Institutionen för informationsteknologi och medier.
    Olsson, Roger
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Institutionen för informationsteknologi och medier.
    Edge-preserving depth-image-based rendering method (2012). In: 2012 International Conference on 3D Imaging, IC3D 2012 - Proceedings, 2012, article 6615113. Conference paper (Refereed)
    Abstract [en]

    Distribution of future 3DTV is likely to use supplementary depth information to a video sequence. New virtual views may then be rendered in order to adjust to different 3D displays. All depth-image-based rendering (DIBR) methods suffer from artifacts in the resulting images, which are corrected by different post-processing. The proposed method is based on fundamental principles of 3D warping. The novelty lies in how the virtual view sample values are obtained from one-dimensional interpolation, where edges are preserved by introducing specific edge-pixels with information about both foreground and background data. This fully avoids the post-processing of filling cracks and holes. We compared rendered virtual views of our method and of the View Synthesis Reference Software (VSRS) and analyzed the results based on typical artifacts. The proposed method obtained better quality for photographic images and similar quality for synthetic images.
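
    For rectified, purely horizontal camera setups, the 3D-warping principle underlying DIBR reduces to shifting each pixel by a depth-dependent disparity. A bare-bones forward-warping sketch under that assumption follows; it produces exactly the cracks and disocclusion holes whose avoidance is the contribution above, and it skips occlusion ordering for brevity.

        import numpy as np

        def forward_warp(texture, depth, focal_px, baseline_m):
            """Warp a view sideways by per-pixel disparity (rectified setup assumed).

            disparity = focal * baseline / depth, so near pixels move the most.
            Positions never written to remain holes (cracks and disocclusions).
            """
            h, w = depth.shape
            out = np.zeros_like(texture)
            written = np.zeros((h, w), dtype=bool)
            disparity = np.round(focal_px * baseline_m / depth).astype(int)
            for y in range(h):
                for x in range(w):
                    nx = x + disparity[y, x]
                    if 0 <= nx < w:
                        out[y, nx] = texture[y, x]
                        written[y, nx] = True
            return out, ~written  # warped view and hole mask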

  • 27.
    Muddala, Suryanarayana Murthy
    et al.
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Sjöström, Mårten
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Olsson, Roger
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Tourancheau, Sylvain
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Edge-aided virtual view rendering for multiview video plus depth, 2013. In: Proceedings of SPIE Volume 8650, Burlingame, CA, USA, 2013: 3D Image Processing (3DIP) and Applications 2013, SPIE - International Society for Optical Engineering, 2013, Art. no. 86500E. Conference paper (Other academic)
    Abstract [en]

    Depth-Image-Based Rendering (DIBR) of virtual views is a fundamental method in three-dimensional (3D) video applications for producing different perspectives from texture and depth information, in particular from the multiview-plus-depth (MVD) format. Artifacts are still present in virtual views as a consequence of imperfect rendering with existing DIBR methods. In this paper, we propose an alternative DIBR method for MVD. In the proposed method we introduce an edge pixel and interpolate pixel values in the virtual view using the actual projected coordinates from two adjacent views, by which cracks and disocclusions are automatically filled. In particular, we propose a method to merge pixel information from two adjacent views in the virtual view before the interpolation; we apply a weighted averaging of projected pixels within the range of one pixel in the virtual view. We compared virtual view images rendered by the proposed method to the corresponding view images rendered by state-of-the-art methods. Objective metrics demonstrated an advantage of the proposed method for most investigated media contents. Subjective test results showed preference for different methods depending on media content, and the test could not demonstrate a significant difference between the proposed method and state-of-the-art methods.
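
    The described merging step, weighted averaging of pixels projected from two views within one pixel of each grid position, might look roughly like this (an illustrative sketch, not the published code):

        import numpy as np

        def merge_projections(xs_per_view, rgb_per_view, width):
            """Each projected sample contributes to its two neighbouring grid
            positions with weight (1 - distance); both views are merged before
            interpolation by accumulating weighted colours."""
            acc = np.zeros((width, 3))
            wsum = np.zeros(width)
            for xs, rgb in zip(xs_per_view, rgb_per_view):
                for x, c in zip(xs, rgb):
                    base = int(np.floor(x))
                    for i in (base, base + 1):
                        if 0 <= i < width:
                            wgt = max(0.0, 1.0 - abs(x - i))
                            acc[i] += wgt * c
                            wsum[i] += wgt
            filled = wsum > 0
            acc[filled] /= wsum[filled, None]
            return acc, filled              # unfilled positions are disocclusions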

  • 28.
    Muddala, Suryanarayana
    et al.
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Olsson, Roger
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Sjöström, Mårten
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Spatio-Temporal Consistent Depth-Image Based Rendering Using Layered Depth Image and Inpainting, 2016. In: EURASIP Journal on Image and Video Processing, ISSN 1687-5176, E-ISSN 1687-5281, Vol. 9, no. 1, pp. 1-19. Journal article (Refereed)
    Abstract [en]

    Depth-image-based rendering (DIBR) is a commonly used method for synthesizing additional views using the video-plus-depth (V+D) format. A critical issue with DIBR-based view synthesis is the lack of information behind foreground objects. This lack manifests itself as disocclusions, i.e. holes next to the foreground objects in rendered virtual views, a consequence of the virtual camera “seeing” behind the foreground object. The disocclusions are larger in the extrapolation case, i.e. the single-camera case. Texture synthesis methods (inpainting methods) aim to fill these disocclusions by producing plausible texture content. However, virtual views inevitably exhibit both spatial and temporal inconsistencies at the filled disocclusion areas, depending on the scene content. In this paper we propose a layered depth image (LDI) approach that improves the spatio-temporal consistency. In the process of LDI generation, depth information is used to classify the foreground and background in order to form a static scene sprite from a set of neighboring frames. Occlusions in the LDI are then identified and filled using inpainting, such that no disocclusions appear when the LDI data is rendered to a virtual view. In addition to the depth information, optical flow is computed to extract the stationary parts of the scene and to classify the occlusions in the inpainting process. Experimental results demonstrate that spatio-temporal inconsistencies are significantly reduced using the proposed method. Furthermore, subjective and objective quality is improved compared to state-of-the-art reference methods.

  • 29.
    Navarro, Hector
    et al.
    Department of Optics, University of Valencia, E-46100 Burjassot, Spain.
    Saavedra, Genaro
    Department of Optics, University of Valencia, E-46100 Burjassot, Spain.
    Martinez-Corral, Manuel
    Department of Optics, University of Valencia, E-46100 Burjassot, Spain.
    Sjöström, Mårten
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för data- och systemvetenskap.
    Olsson, Roger
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för data- och systemvetenskap.
    Depth-of-field enhancement in integral imaging by selective depth-deconvolution, 2014. In: IEEE/OSA Journal of Display Technology, ISSN 1551-319X, E-ISSN 1558-9323, Vol. 10, no. 3, pp. 182-188. Journal article (Refereed)
    Abstract [en]

    One of the major drawbacks of the integral imaging technique is its limited depth of field. This limitation is imposed by the numerical aperture of the microlenses. In this paper we propose a method to extend the depth of field of integral imaging systems in the reconstruction stage. The method is based on the combination of deconvolution tools and depth filtering of each elemental image using disparity map information. We demonstrate our proposal by presenting digital reconstructions of a 3D scene focused at different depths with extended depth of field.
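
    The deconvolution building block can be sketched with a standard frequency-domain Wiener filter (a generic stand-in, not the authors' exact tool; psf and k are assumptions), applied once per depth slice with the PSF matching that depth and composited using the disparity map:

        import numpy as np

        def wiener_deconvolve(img, psf, k=0.01):
            """Wiener deconvolution with constant regularization k; psf is
            assumed to have the same shape as img, centred at the array centre."""
            H = np.fft.fft2(np.fft.ifftshift(psf))
            G = np.fft.fft2(img)
            F = np.conj(H) * G / (np.abs(H) ** 2 + k)
            return np.real(np.fft.ifft2(F))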

  • 30.
    Navarro-Fructuoso, Hector
    et al.
    Dept. of Optics, Univ. of Valencia, E-46100, Burjassot, Spain .
    Saavedra-Tortosa, G.
    Dept. of Optics, Univ. of Valencia, E-46100, Burjassot, Spain .
    Martinez-Corral, Manuel
    Dept. of Optics, Univ. of Valencia, E-46100, Burjassot, Spain .
    Sjöström, Mårten
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Olsson, Roger
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Extended depth-of-field in integral imaging by depth-dependent deconvolution, 2013. Conference paper (Refereed)
    Abstract [en]

    Integral Imaging is a technique to obtain true color 3D images that can provide full and continuous motion parallax for several viewers. The depth of field of these systems is mainly limited by the numerical aperture of each lenslet of the microlens array. A digital method has been developed to increase the depth of field of Integral Imaging systems in the reconstruction stage. By means of the disparity map of each elemental image, it is possible to classify the objects of the scene according to their distance from the microlenses and apply a selective deconvolution for each depth of the scene. Topographical reconstructions with enhanced depth of field of a 3D scene are presented to support our proposal.

  • 31.
    Qureshi, Kamran
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för elektronikkonstruktion.
    Pedestrian Detection on FPGA, 2014. Independent thesis, Advanced level (degree of Master (One Year)), 20 points / 30 HE credits. Student thesis
    Abstract [en]

    Image processing emerges from the curiosity of human vision. Translating what we see in everyday life, and how we differentiate between objects, into robotic vision is a challenging and modern research topic. This thesis focuses on detecting a pedestrian within a standard-format image. The efficiency of the algorithm is observed after its implementation on an FPGA. The algorithm for pedestrian detection was developed using MATLAB as a base. To detect a pedestrian, a histogram of oriented gradients (HOG) of an image was computed. The study indicates that HOG is unique for different objects within an image. The HOG of a series of images was computed to train a binary classifier. A new image was then fed to the classifier in order to test its efficiency. Within the time frame of the thesis, the algorithm was partially translated to a hardware description using VHDL as a base descriptor. The proficiency of the hardware implementation was noted and the result exported to MATLAB for further processing. A hybrid model was created, in which the pre-processing steps were computed on the FPGA and the classification performed in MATLAB. The outcome of the thesis shows that HOG is a very efficient and effective way to classify and differentiate objects within an image. Given its efficiency, this algorithm may even be extended to video.
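
    A compact sketch of the HOG computation the thesis builds on (the standard algorithm in simplified form, block normalization omitted; cell size and bin count are typical defaults, not taken from the thesis):

        import numpy as np

        def hog(gray, cell=8, bins=9):
            """Per-cell histograms of gradient orientation weighted by gradient
            magnitude, concatenated into one feature vector for the classifier."""
            gy, gx = np.gradient(gray.astype(float))
            mag = np.hypot(gx, gy)
            ang = np.rad2deg(np.arctan2(gy, gx)) % 180          # unsigned orientation
            b = np.minimum((ang / (180 / bins)).astype(int), bins - 1)
            ch, cw = gray.shape[0] // cell, gray.shape[1] // cell
            hist = np.zeros((ch, cw, bins))
            for i in range(ch):
                for j in range(cw):
                    sl = np.s_[i * cell:(i + 1) * cell, j * cell:(j + 1) * cell]
                    hist[i, j] = np.bincount(b[sl].ravel(),
                                             weights=mag[sl].ravel(),
                                             minlength=bins)[:bins]
            return hist.ravel()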

  • 32.
    Schwarz, Sebastian
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Institutionen för informationsteknologi och medier.
    Depth Map Upscaling for Three-Dimensional Television: The Edge-Weighted Optimization Concept, 2012. Licentiate thesis, with papers (Other academic)
    Abstract [en]

    With the recent comeback of three-dimensional (3D) movies to the cinemas, there have been increasing efforts to spread the commercial success of 3D to new markets. The possibility of a 3D experience at home, such as three-dimensional television (3DTV), has generated a great deal of interest within the research and standardization community.

    A central issue for 3DTV is the creation and representation of 3D content. Scene depth information plays a crucial role in all parts of the distribution chain, from content capture via transmission to the actual 3D display. This depth information is transmitted in the form of depth maps accompanied by corresponding video frames, i.e. for Depth Image Based Rendering (DIBR) view synthesis. Nonetheless, scenarios exist in which the original spatial resolutions of depth maps and video frames do not match, e.g. sensor-driven depth capture or asymmetric 3D video coding. This resolution discrepancy is a problem, since DIBR requires accordance between the video frame and depth map. A considerable amount of research has been conducted into ways of matching low-resolution depth maps to high-resolution video frames. Many proposed solutions utilize corresponding texture information in the upscaling process; however, they mostly fail to review this information for validity.

    In striving for better 3DTV quality, this thesis presents the Edge-Weighted Optimization Concept (EWOC), a novel texture-guided depth upscaling application that addresses the lack of information validation. EWOC uses edge information from video frames as guidance in the depth upscaling process and, additionally, confirms this information against the original low-resolution depth. Over the course of four publications, EWOC is applied to 3D content creation and distribution. Various guidance sources, such as different color spaces or texture pre-processing, are investigated. An alternative depth compression scheme, based on depth map upscaling, is proposed, and extensions for increased visual quality and computational performance are presented in this thesis. EWOC was evaluated and compared with competing approaches, with the main focus consistently on the visual quality of rendered 3D views. The results show an increase in both objective and subjective visual quality compared to state-of-the-art depth map upscaling methods. This quality gain motivates the choice of EWOC in applications affected by low-resolution depth.

    In the end, EWOC can improve 3D content generation and distribution, enhancing the 3D experience to boost the commercial success of 3DTV.
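
    The underlying idea of edge-weighted depth upscaling can be written as a weighted least-squares problem. The sketch below is a generic formulation of this family of methods, not EWOC's published implementation; all parameters and names are illustrative.

        import numpy as np
        import scipy.sparse as sp
        import scipy.sparse.linalg as spla

        def upscale_depth(d_sparse, known, w, lam=1.0):
            """Minimize sum_known (d - d_sparse)^2 + lam * sum_edges w_ij (d_i - d_j)^2,
            where w is small across texture edges so depth may jump there.
            d_sparse holds the low-resolution depths placed on the full grid."""
            H, W = known.shape
            n = H * W
            idx = np.arange(n).reshape(H, W)
            i_h, j_h = idx[:, :-1].ravel(), idx[:, 1:].ravel()   # horizontal links
            i_v, j_v = idx[:-1, :].ravel(), idx[1:, :].ravel()   # vertical links
            i_all = np.concatenate([i_h, i_v])
            j_all = np.concatenate([j_h, j_v])
            wf = w.ravel()
            w_all = np.minimum(wf[i_all], wf[j_all])             # weakest-link weight
            A = sp.coo_matrix((w_all, (i_all, j_all)), shape=(n, n))
            Wsym = A + A.T
            L = sp.diags(np.asarray(Wsym.sum(1)).ravel()) - Wsym # graph Laplacian
            D = sp.diags(known.ravel().astype(float))
            rhs = (known * d_sparse).ravel()
            return spla.spsolve((D + lam * L).tocsr(), rhs).reshape(H, W)

    The edge weight w would typically be derived from texture gradients, with the validation step the thesis describes suppressing texture edges that have no counterpart in the low-resolution depth.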

  • 33.
    Schwarz, Sebastian
    et al.
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Olsson, Roger
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Sjöström, Mårten
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Depth Sensing for 3DTV: A Survey, 2013. In: IEEE Multimedia, ISSN 1070-986X, E-ISSN 1941-0166, Vol. 20, no. 4, pp. 10-17. Journal article (Refereed)
    Abstract [en]

    In the context of 3D video systems, depth information could be used to render a scene from additional viewpoints. Although there have been many recent advances in this area, including the introduction of the Microsoft Kinect sensor, the robust acquisition of such information continues to be a challenge. This article reviews three depth-sensing approaches for 3DTV. The authors discuss several approaches for acquiring depth information and provide a comparative analysis of their characteristics.

  • 34.
    Schwarz, Sebastian
    et al.
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Institutionen för informationsteknologi och medier.
    Olsson, Roger
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Institutionen för informationsteknologi och medier.
    Sjöström, Mårten
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Institutionen för informationsteknologi och medier.
    Tourancheau, Sylvain
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Institutionen för informationsteknologi och medier.
    Adaptive depth filtering for HEVC 3D video coding, 2012. In: 2012 Picture Coding Symposium, PCS 2012, Proceedings, IEEE conference proceedings, 2012, pp. 49-52. Conference paper (Refereed)
    Abstract [en]

    Consumer interest in 3D television (3DTV) is growing steadily, but currently available 3D displays still need additional eye-wear and suffer from the limitation of a single stereo view pair. It can therefore be assumed that auto-stereoscopic multiview displays are the next step in 3D-at-home entertainment, since these displays can utilize the Multiview Video plus Depth (MVD) format to synthesize numerous viewing angles from only a small set of given input views. This makes efficient MVD compression an important keystone for the commercial success of 3DTV. In this paper we concentrate on the compression of depth information in an MVD scenario. Several publications have suggested depth down- and upsampling to increase coding efficiency. We follow this path, using our recently introduced Edge Weighted Optimization Concept (EWOC) for depth upscaling. EWOC uses edge information from the video frame in the upscaling process and allows the use of sparse, non-uniformly distributed depth values. We exploit this fact to extend the depth down-/upsampling idea with an adaptive low-pass filter, reducing high-energy parts in the original depth map prior to subsampling and compression. Objective results show the viability of our approach for depth map compression with up-to-date High-Efficiency Video Coding (HEVC). For the same Y-PSNR in synthesized views we achieve up to 18.5% bit rate decrease compared to full-scale depth and around 10% compared to competing depth down-/upsampling solutions. These results were confirmed by a subjective quality assessment, which showed a statistically significant preference in 87.5% of the test cases.
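
    The depth down-sampling side of the scheme can be sketched as follows (a fixed-sigma stand-in; the paper's filter adapts to the depth content, which this sketch omits):

        import numpy as np
        from scipy.ndimage import gaussian_filter

        def prefilter_and_downsample(depth, factor=4, sigma=1.5):
            """Low-pass the depth map before subsampling so that high-energy
            detail, costly to encode, is attenuated prior to compression."""
            smooth = gaussian_filter(depth.astype(float), sigma)
            return smooth[::factor, ::factor]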

  • 35.
    Schwarz, Sebastian
    et al.
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Sjöström, Mårten
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Olsson, Roger
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    A Weighted Optimization Approach to Time-of-Flight Sensor Fusion, 2014. In: IEEE Transactions on Image Processing, ISSN 1057-7149, E-ISSN 1941-0042, Vol. 23, no. 1, pp. 214-225. Journal article (Refereed)
    Abstract [en]

    Acquiring scenery depth is a fundamental task in computer vision, with many applications in manufacturing, surveillance, or robotics relying on accurate scenery information. Time-of-flight cameras can provide depth information in real time and overcome shortcomings of traditional stereo analysis. However, they provide limited spatial resolution, and sophisticated upscaling algorithms are therefore sought after. In this paper, we present a sensor fusion approach to time-of-flight super resolution based on the combination of depth and texture sources. Unlike other texture-guided approaches, we interpret the depth upscaling process as a weighted energy optimization problem. Three different weights are introduced, employing different available sensor data. The individual weights address object boundaries in depth, depth sensor noise, and temporal consistency. Applied in consecutive order, they form three weighting strategies for time-of-flight super resolution. Objective evaluations show advantages in depth accuracy and for depth-image-based rendering compared with state-of-the-art depth upscaling. Subjective view synthesis evaluation shows a significant increase in viewer preference, by a factor of four, in stereoscopic viewing conditions. To the best of our knowledge, this is the first extensive subjective test performed on time-of-flight depth upscaling. Objective and subjective results prove the suitability of our time-of-flight super-resolution approach for depth scenery capture.
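
    How the three weights might combine in practice (an assumed multiplicative form for illustration; the published weighting strategies differ in detail):

        import numpy as np

        def fusion_weight(edge_w, amplitude, prev_w, alpha=0.7, a0=0.1):
            """Texture-edge weight times an amplitude-based noise reliability,
            blended with the previous frame's weights for temporal consistency."""
            w_noise = amplitude / (amplitude + a0)   # dim ToF returns get low trust
            w = edge_w * w_noise
            return alpha * w + (1 - alpha) * prev_w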

  • 36.
    Schwarz, Sebastian
    et al.
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Institutionen för informationsteknologi och medier.
    Sjöström, Mårten
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Institutionen för informationsteknologi och medier.
    Olsson, Roger
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Institutionen för informationsteknologi och medier.
    Improved edge detection for EWOC depth upscaling, 2012. In: 2012 19th International Conference on Systems, Signals and Image Processing, IWSSIP 2012, IEEE conference proceedings, 2012, pp. 1-4. Conference paper (Refereed)
    Abstract [en]

    The need for accurate depth information in three-dimensional television (3DTV) encourages the use of range sensors, i.e. time-of-flight (ToF) cameras. Since these sensors provide only limited spatial resolution compared to modern high resolution image sensors, upscaling methods are much needed. Typical depth upscaling algorithms fuse low resolution depth information with appropriate high resolution texture frames, taking advantage of the additional texture information in the upscaling process. We recently introduced a promising upscaling method, utilizing edge information from the texture frame to upscale low resolution depth maps. This paper examines how a more thorough edge detection can be achieved by investigating different edge detection sources, such as intensity, color spaces and difference signals. Our findings show that a combination of sources based on the perceptual qualities of the human visual system (HVS) leads to slightly improved results. On the other hand, these improvements imply a more complex edge detection.

  • 37.
    Schwarz, Sebastian
    et al.
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Institutionen för informationsteknologi och medier.
    Sjöström, Mårten
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Institutionen för informationsteknologi och medier.
    Olsson, Roger
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Institutionen för informationsteknologi och medier.
    Incremental depth upscaling using an edge weighted optimization concept, 2012. In: 3DTV-Conference, 2012, Art. no. 6365429. Conference paper (Refereed)
    Abstract [en]

    Precise scene depth information is a pre-requisite in three-dimensional television (3DTV), e.g. for high quality view synthesis in autostereoscopic multiview displays. Unfortunately, this information is not easily obtained and often of limited quality. Dedicated range sensors, such as time-of-flight (ToF) cameras, can deliver reliable depth information where (stereo-)matching fails. Nonetheless, since these sensors provide only restricted spatial resolution, sophisticated upscaling methods are sought-after to match depth information to corresponding texture frames. Where traditional upscaling fails, novel approaches have been proposed, utilizing additional information from the texture for the depth upscaling process. We recently proposed the Edge Weighted Optimization Concept (EWOC) for ToF upscaling, using texture edges for accurate depth boundaries. In this paper we propose an important update to EWOC, dividing it into smaller incremental upscaling steps. We predict two major improvements from this. Firstly, processing time should be decreased by dividing one big calculation into several smaller steps. Secondly, we assume an increase in quality for the upscaled depth map, due to a more coherent edge detection on the video frame. In our evaluations we can show the desired effect on processing time, cutting the calculation time by more than half. We can also show an increase in visual quality, based on objective quality metrics, compared to the original implementation as well as competing proposals.
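
    Structurally, the incremental variant replaces one large upscaling with a loop of guided 2x steps, as in this sketch (the naive_2x stand-in marks where the actual EWOC optimization would run; all names are illustrative):

        import numpy as np

        def naive_2x(depth, guide):
            """Placeholder for one guided 2x step (nearest-neighbour here)."""
            return np.repeat(np.repeat(depth, 2, axis=0), 2, axis=1)

        def incremental_upscale(depth_low, guide_pyramid):
            """Reach the target resolution in repeated 2x steps, guided at each
            scale by the matching video frame (coarse to fine)."""
            d = depth_low
            for guide in guide_pyramid:
                d = naive_2x(d, guide)
            return d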

  • 38.
    Schwarz, Sebastian
    et al.
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Sjöström, Mårten
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Olsson, Roger
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Multivariate Sensitivity Analysis of Time-of-Flight Sensor Fusion, 2014. In: 3D Research, ISSN 2092-6731, Vol. 5, no. 3. Journal article (Refereed)
    Abstract [en]

    Obtaining three-dimensional scenery data is an essential task in computer vision, with diverse applications in areas such as manufacturing and quality control, security and surveillance, or user interaction and entertainment. Dedicated Time-of-Flight sensors can provide detailed scenery depth in real time and overcome shortcomings of traditional stereo analysis. Nonetheless, they do not provide texture information and have limited spatial resolution. Therefore such sensors are typically combined with high-resolution video sensors. Time-of-Flight sensor fusion is a highly active field of research, and over recent years there have been multiple proposals addressing important topics such as texture-guided depth upsampling and depth data denoising. In this article we take a step back and look at the underlying principles of ToF sensor fusion. We derive the ToF sensor fusion error model and evaluate its sensitivity to inaccuracies in camera calibration and depth measurements. In accordance with our findings, we propose certain courses of action to ensure high-quality fusion results. With this multivariate sensitivity analysis of the ToF sensor fusion model, we provide an important guideline for designing, calibrating and running a sophisticated Time-of-Flight sensor fusion capture system.

  • 39.
    Schwarz, Sebastian
    et al.
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Sjöström, Mårten
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Olsson, Roger
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Temporal Consistent Depth Map Upscaling for 3DTV, 2014. In: Proceedings of SPIE - The International Society for Optical Engineering: Three-Dimensional Image Processing, Measurement (3DIPM), and Applications 2014 / [ed] Atilla M. Baskurt; Robert Sitnik, 2014, Art. no. 901302. Conference paper (Refereed)
    Abstract [en]

    Precise scene depth information is a pre-requisite in three-dimensional television (3DTV), e.g. for high quality view synthesis in autostereoscopic multiview displays. Unfortunately, this information is not easily obtained and often of limited quality. Dedicated range sensors, such as time-of-flight (ToF) cameras, can deliver reliable depth information where (stereo-)matching fails. Nonetheless, since these sensors provide only restricted spatial resolution, sophisticated upscaling methods are sought-after to match depth information to corresponding texture frames. Where traditional upscaling fails, novel approaches have been proposed, utilizing additional information from the texture for the depth upscaling process. We recently proposed the Edge Weighted Optimization Concept (EWOC) for ToF upscaling, using texture edges for accurate depth boundaries. In this paper we propose an important update to EWOC, dividing it into smaller incremental upscaling steps. We predict two major improvements from this. Firstly, processing time should be decreased by dividing one big calculation into several smaller steps. Secondly, we assume an increase in quality for the upscaled depth map, due to a more coherent edge detection on the video frame. In our evaluations we can show the desired effect on processing time, cutting the calculation time by more than half. We can also show an increase in visual quality, based on objective quality metrics, compared to the original implementation as well as competing proposals.

  • 40.
    Schwarz, Sebastian
    et al.
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Sjöström, Mårten
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Olsson, Roger
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för informations- och kommunikationssystem.
    Time-of-Flight Sensor Fusion with Depth Measurement Reliability Weighting, 2014. In: 3DTV-Conference, IEEE Computer Society, 2014, Art. no. 6874759. Conference paper (Refereed)
    Abstract [en]

    Accurate scene depth capture is essential for the success of three-dimensional television (3DTV), e.g. for high quality view synthesis in autostereoscopic multiview displays. Unfortunately, scene depth is not easily obtained and often of limited quality. Dedicated Time-of-Flight (ToF) sensors can deliver reliable depth readings where traditional methods, such as stereo-vision analysis, fail. However, since ToF sensors provide only limited spatial resolution and suffer from sensor noise, sophisticated upsampling methods are sought after. A multitude of ToF solutions have been proposed over recent years. Most of them achieve ToF super-resolution (TSR) by sensor fusion between ToF and additional sources, e.g. video. We recently proposed a weighted error energy minimization approach for ToF super-resolution, incorporating texture, sensor noise and temporal information. In this article, we take a closer look at the sensor noise weighting related to the Time-of-Flight active brightness signal. We determine a depth measurement reliability function by optimizing its free parameters on test data and verifying it with independent test cases. In the presented double-weighted TSR proposal, depth readings are weighted into the upsampling process with regard to their reliability, removing erroneous influences from the final result. Our evaluations prove the desired effect of depth measurement reliability weighting, decreasing the depth upsampling error by almost 40% in comparison to competing proposals.
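
    A reliability function of the kind described, mapping active brightness to a weight, could look like this (the thresholds here are illustrative placeholders; the paper fits its free parameters to test data):

        import numpy as np

        def reliability(amplitude, a_min=20.0, a_sat=200.0):
            """Map ToF active-brightness amplitude to a [0, 1] weight: readings
            below a_min are discarded, readings above a_sat are fully trusted."""
            return np.clip((amplitude - a_min) / (a_sat - a_min), 0.0, 1.0)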

  • 41.
    Shallari, Irida
    et al.
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för elektronikkonstruktion.
    Krug, Silvia
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för elektronikkonstruktion.
    O'Nils, Mattias
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Avdelningen för elektronikkonstruktion.
    Architectural evaluation of node-server partitioning for people counting, 2018. In: ACM International Conference Proceeding Series, New York: ACM Digital Library, 2018, Article No. 1. Conference paper (Refereed)
    Abstract [en]

    The Internet of Things has changed the range of applications for cameras, requiring them to be easily deployed in a variety of indoor and outdoor scenarios while achieving high processing performance. As a result, future projections emphasise the need for battery-operated smart cameras capable of complex image processing tasks that also communicate with one another and with the server. Based on these considerations, we evaluate in-node and node-server configurations of image processing tasks to provide insight into how task partitioning affects the overall energy consumption. The two main energy components considered for their influence on the total energy consumption are processing and communication energy. The results from the people counting scenario show that performing background modelling, subtraction and segmentation in-node, while transferring the remaining tasks to the server, results in the most energy-efficient configuration, optimising both processing and communication energy. In addition, the inclusion of data reduction techniques such as data aggregation and compression did not always result in lower energy consumption, as is generally assumed, and the final optimal partition did not include data reduction.
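
    The trade-off being evaluated reduces to summing in-node processing energy and the radio cost of shipping the intermediate data, for every possible cut point in the task chain. A toy model with invented numbers (all constants are assumptions, not measurements from the paper):

        RAW = 300e3                            # bytes per raw frame (assumption)
        E_PROC = [5e-3, 8e-3, 6e-3, 20e-3]     # J per task if run in-node (assumed)
        DATA_AFTER = [40e3, 10e3, 2e3, 0.1e3]  # bytes leaving each task (assumed)
        E_TX = 50e-9                           # J per transmitted byte (assumed)

        def total_energy(cut):
            """Tasks [0, cut) run in-node; the rest, plus the data, go to the server."""
            payload = RAW if cut == 0 else DATA_AFTER[cut - 1]
            return sum(E_PROC[:cut]) + payload * E_TX

        best_cut = min(range(len(E_PROC) + 1), key=total_energy)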

  • 42.
    Sjöström, Mårten
    et al.
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Institutionen för informationsteknologi och medier.
    Tourancheau, Sylvain
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Institutionen för informationsteknologi och medier.
    Wang, Xusheng
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Institutionen för informationsteknologi och medier.
    Olsson, Roger
    Mittuniversitetet, Fakulteten för naturvetenskap, teknik och medier, Institutionen för informationsteknologi och medier.
    A locally content-dependent filter for inter-perspective anti-aliasing, 2012. In: Proceedings of SPIE - The International Society for Optical Engineering, SPIE - International Society for Optical Engineering, 2012, Art. no. 829006. Conference paper (Refereed)
    Abstract [en]

    Presentations on multiview and lightfield displays have become increasingly popular. The restricted number of views implies a non-smooth transition between views if objects with sharp edges are far from the display plane. The phenomenon is explained by inter-perspective aliasing. This is undesirable in applications where a correct perception of the scene is required, such as in science and medicine. Anti-aliasing filters have been proposed in the literature and are defined according to the minimum and maximum depth present in the scene. We suggest a method that subdivides the ray-space and adjusts the anti-aliasing filter locally to the scene contents. We further propose new filter kernels, based on the ray-space frequency domain, that assure no aliasing while keeping the maximum amount of information unaltered. The proposed method outperforms filters of earlier works. Different filter kernels are compared. Details of the output are sharper using a proposed filter kernel, which also preserves the most information.

  • 43.
    Stöggl, Thomas
    et al.
    Mittuniversitetet, Fakulteten för humanvetenskap, Avdelningen för hälsovetenskap. Department of Sport Science and Kinesiology, University of SalzburgHallein/Rif, Austria .
    Holst, Anders
    School of Computer Science and Communication, Royal Institute of Technology, Stockholm, Sweden .
    Jonasson, Arndt
    Swedish Institute of Computer Science, Kista, Sweden.
    Andersson, Erik
    Mittuniversitetet, Fakulteten för humanvetenskap, Avdelningen för hälsovetenskap.
    Wunsch, Thomas
    Department of Sport Science and Kinesiology, University of SalzburgHallein/Rif, Austria .
    Norström, Christer
    Swedish Institute of Computer Science, Kista, Sweden .
    Holmberg, Hans-Christer
    Mittuniversitetet, Fakulteten för humanvetenskap, Avdelningen för hälsovetenskap. Swedish Olympic Committee, Stockholm, Sweden .
    Automatic classification of the sub-techniques (gears) used in cross-country ski skating employing a mobile phone, 2014. In: Sensors, ISSN 1424-8220, E-ISSN 1424-8220, Vol. 14, no. 11, pp. 20589-20601. Journal article (Refereed)
    Abstract [en]

    The purpose of the current study was to develop and validate an automatic algorithm for the classification of cross-country (XC) ski-skating gears (G) using Smartphone accelerometer data. Eleven XC skiers (seven men, four women) with regional-to-international levels of performance carried out roller skiing trials on a treadmill using fixed gears (G2left, G2right, G3, G4left, G4right) and a 950-m trial using different speeds and inclines, applying gears and sides as they normally would. Gear classification by the Smartphone (worn on the chest) was compared with classification based on video recordings. For machine learning, a collective database was compared to individual data. The Smartphone application identified the trials with fixed gears correctly in all cases. In the 950-m trial, participants executed 140 ± 22 cycles as assessed by video analysis, with the automatic Smartphone application giving a similar value. Based on collective data, gears were identified correctly 86.0% ± 8.9% of the time, a value that rose to 90.3% ± 4.1% (P < 0.01) with machine learning from individual data. Classification was most often incorrect during transitions between gears, especially to or from G3. Identification was most often correct for skiers who made relatively few transitions between gears. The accuracy of the automatic procedure for identifying G2left, G2right, G3, G4left and G4right was 96%, 90%, 81%, 88% and 94%, respectively. The algorithm identified gears correctly 100% of the time when a single gear was used and 90% of the time when different gears were employed during a variable protocol. This algorithm could be improved with respect to the identification of transitions between gears or the side employed within a given gear.
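
    The classification pipeline implied here, windowing the chest accelerometer stream and extracting features for a trained classifier, can be sketched as follows (feature choice and window length are assumptions, not the study's exact design):

        import numpy as np

        def window_features(acc, fs=50, win_s=2.0):
            """Cut an (N x 3) accelerometer stream into non-overlapping windows
            and compute per-axis mean, std and low-frequency spectral magnitudes;
            a classifier then maps each window to a gear label (G2-G4)."""
            w = int(fs * win_s)
            feats = []
            for start in range(0, len(acc) - w + 1, w):
                seg = acc[start:start + w]
                spec = np.abs(np.fft.rfft(seg, axis=0))[1:4].ravel()
                feats.append(np.concatenate([seg.mean(0), seg.std(0), spec]))
            return np.array(feats)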
