Mid Sweden University

miun.se Publications
1-50 of 146
  • 1.
    Ahmad, Waqas
    et al.
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Systems and Technology.
    Ghafoor, Mubeen
    COMSATS University Islamabad, Pakistan.
    Tariq, Syed Ali
    COMSATS University Islamabad, Pakistan.
    Hassan, Ali
    COMSATS University Islamabad, Pakistan.
    Sjöström, Mårten
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Systems and Technology.
    Olsson, Roger
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Systems and Technology.
    Computationally Efficient Light Field Image Compression Using a Multiview HEVC Framework (2019). In: IEEE Access, E-ISSN 2169-3536, Vol. 7, p. 143002-143014, article id 8853251. Article in journal (Refereed)
    Abstract [en]

    The acquisition of the spatial and angular information of a scene using light field (LF) technologies supplements a wide range of post-processing applications, such as scene reconstruction, refocusing, virtual view synthesis, and so forth. The additional angular information possessed by LF data increases the size of the overall data captured while offering the same spatial resolution. The main contributor to the size of the captured data (i.e., the angular information) contains a high correlation that is exploited by state-of-the-art video encoders by treating the LF as a pseudo video sequence (PVS). The interpretation of the LF as a single PVS restricts the encoding scheme to utilizing only the one-dimensional angular correlation present in the LF data. In this paper, we present an LF compression framework that efficiently exploits the spatial and angular correlation using the multiview extension of high-efficiency video coding (MV-HEVC). The input LF views are converted into multiple PVSs and organized hierarchically. The rate-allocation scheme takes into account the assigned organization of frames and distributes quality/bits among them accordingly. Subsequently, the reference picture selection scheme prioritizes the reference frames based on the assigned quality. The proposed compression scheme is evaluated following the common test conditions set by JPEG Pleno. The proposed scheme performs 0.75 dB better than state-of-the-art compression schemes and 2.5 dB better than the x265-based JPEG Pleno anchor scheme. Moreover, an optimized motion-search scheme is proposed in the framework that reduces the computational complexity (in terms of the number of sum-of-absolute-differences [SAD] computations) of motion estimation by up to 87% with a negligible loss in visual quality (approximately 0.05 dB).
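The hierarchical frame organization and rate allocation described in the abstract can be sketched roughly as follows. This is an illustration only, with hypothetical names and an assumed "+1 QP per hierarchy level" policy; the paper's actual assignment rule and QP offsets are not reproduced here.

```python
# Sketch: convert an N x N grid of light-field views into one pseudo video
# sequence (PVS) per row, then assign each frame a hierarchy level so a
# rate-allocation step can give more bits (lower QP) to frames that are
# used more often as references. Hypothetical helper, not the authors' code.

def organize_pvs(grid_rows, grid_cols, base_qp=32):
    """Return {(row, col): (pvs_index, frame_index, qp)} for an LF view grid."""
    center_row = grid_rows // 2
    center_col = grid_cols // 2
    organization = {}
    for r in range(grid_rows):
        for c in range(grid_cols):
            # Hierarchy level grows with (Chebyshev) distance from the
            # central view, which serves as the highest-quality anchor.
            level = max(abs(r - center_row), abs(c - center_col))
            # Assumed rate-allocation policy: +1 QP per hierarchy level.
            organization[(r, c)] = (r, c, base_qp + level)
    return organization

org = organize_pvs(13, 13)
central = org[(6, 6)]   # central view: level 0, best quality (lowest QP)
corner = org[(0, 0)]    # corner view: deepest level, coarsest quality
```

In MV-HEVC terms, each row-PVS would become one "view" of the multiview bitstream, and the per-frame QP offsets would be signalled through the encoder configuration.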

    Download full text (pdf)
    fulltext
  • 2.
    Ahmad, Waqas
    et al.
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Systems and Technology.
    Olsson, Roger
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Systems and Technology.
    Sjöström, Mårten
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Systems and Technology.
    Interpreting Plenoptic Images as Multi-View Sequences for Improved Compression (2017). Data set
    Abstract [en]

    The paper was written in response to the ICIP 2017 Grand Challenge on plenoptic image compression. The input image format and compression rates set out by the competition are followed to estimate the results.

    Download full text (pdf)
    data set
  • 3.
    Ahmad, Waqas
    et al.
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Systems and Technology.
    Olsson, Roger
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Systems and Technology.
    Sjöström, Mårten
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Systems and Technology.
    Interpreting Plenoptic Images as Multi-View Sequences for Improved Compression (2017). In: ICIP 2017, IEEE, 2017, p. 4557-4561. Conference paper (Refereed)
    Abstract [en]

    Over the last decade, advancements in optical devices have made it possible for novel image acquisition technologies to appear. Angular information for each spatial point is acquired in addition to the spatial information of the scene, which enables 3D scene reconstruction and various post-processing effects. The current generation of plenoptic cameras spatially multiplexes the angular information, which implies an increase in image resolution to retain the level of spatial information gathered by conventional cameras. In this work, the resulting plenoptic image is interpreted as a multi-view sequence that is efficiently compressed using the multi-view extension of high efficiency video coding (MV-HEVC). A novel two-dimensional weighted prediction and rate allocation scheme is proposed to adapt the HEVC compression structure to the plenoptic image properties. The proposed coding approach is a response to the ICIP 2017 Grand Challenge: Light Field Image Coding. The proposed scheme outperforms all ICME contestants, and improves on the JPEG anchor of ICME with an average PSNR gain of 7.5 dB and on the HEVC anchor of the ICIP 2017 Grand Challenge with an average PSNR gain of 2.4 dB.
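The weighted-prediction idea mentioned above can be illustrated with a tiny distance-weighted blend: a view is predicted from two references, and a nearer reference (in the view grid) gets a larger weight. This is a minimal sketch under that assumption, not the paper's actual MV-HEVC weight derivation.

```python
# Illustrative distance-weighted bi-prediction (hypothetical, simplified):
# weights are inversely proportional to each reference's distance from the
# view being predicted.

def weighted_prediction(ref_a, ref_b, dist_a, dist_b):
    """Blend two reference blocks; nearer references get larger weights."""
    w_a = dist_b / (dist_a + dist_b)   # weight grows as the OTHER distance grows
    w_b = dist_a / (dist_a + dist_b)
    return [round(w_a * pa + w_b * pb) for pa, pb in zip(ref_a, ref_b)]

# A reference at distance 1 dominates one at distance 3 (weights 0.75 / 0.25).
block = weighted_prediction([100, 104, 108], [120, 124, 128], dist_a=1, dist_b=3)
```

In an actual encoder, such weights would be mapped onto HEVC's integer weighted-prediction parameters rather than applied in floating point.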

    Download full text (pdf)
    fulltext
  • 4.
    Ahmad, Waqas
    et al.
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Systems and Technology.
    Olsson, Roger
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Systems and Technology.
    Sjöström, Mårten
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Systems and Technology.
    Towards a generic compression solution for densely and sparsely sampled light field data (2018). In: Proceedings of the 25th IEEE International Conference on Image Processing, 2018, p. 654-658, article id 8451051. Conference paper (Refereed)
    Abstract [en]

    Light field (LF) acquisition technologies capture the spatial and angular information present in scenes. The angular information paves the way for various post-processing applications such as scene reconstruction, refocusing, and synthetic aperture. The light field is usually captured either by a single plenoptic camera or by multiple traditional cameras. The former captures a dense LF, while the latter captures a sparse LF. This paper presents a generic compression scheme that efficiently compresses both densely and sparsely sampled LFs. A plenoptic image is converted into sub-aperture images, and each sub-aperture image is interpreted as a frame of a multi-view sequence. Similarly, each view of a multi-camera system is treated as a frame of a multi-view sequence. The multi-view extension of high efficiency video coding (MV-HEVC) is used to encode the pseudo multi-view sequence. This paper proposes an adaptive prediction and rate allocation scheme that efficiently compresses LF data irrespective of the acquisition technology used.
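The sub-aperture conversion step mentioned above can be sketched directly: under an idealized plenoptic-1.0 model, the pixel at position (u, v) beneath every lenslet belongs to the same sub-aperture view. The helper below is a simplified, dependency-free illustration (real pipelines would use NumPy and handle calibration, vignetting, and hexagonal lenslet layouts).

```python
# Sketch: split a lenslet image (2D list of pixels) into sub-aperture views.
# Assumes a perfectly rectified rectangular lenslet grid of lens_h x lens_w
# pixels per lenslet -- a strong simplification of real plenoptic data.

def to_subaperture_views(lenslet_img, lens_h, lens_w):
    """Return {(u, v): view} with lens_h * lens_w sub-aperture views."""
    rows, cols = len(lenslet_img), len(lenslet_img[0])
    views = {}
    for u in range(lens_h):
        for v in range(lens_w):
            # Pixel (u, v) under each lenslet -> one sample of view (u, v).
            views[(u, v)] = [
                [lenslet_img[r][c] for c in range(v, cols, lens_w)]
                for r in range(u, rows, lens_h)
            ]
    return views

# 4x4 image with 2x2 lenslets -> four 2x2 sub-aperture views.
img = [[0, 1, 0, 1],
       [2, 3, 2, 3],
       [0, 1, 0, 1],
       [2, 3, 2, 3]]
views = to_subaperture_views(img, 2, 2)
```

Each resulting view can then be fed to the encoder as one frame of the pseudo multi-view sequence.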

    Download full text (pdf)
    fulltext
  • 5.
    Ahmad, Waqas
    et al.
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Systems and Technology.
    Palmieri, Luca
    Christian-Albrechts-Universität, Kiel, Germany.
    Koch, Reinhard
    Christian-Albrechts-Universität, Kiel, Germany.
    Sjöström, Mårten
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Systems and Technology.
    Matching Light Field Datasets From Plenoptic Cameras 1.0 And 2.0 (2018). In: Proceedings of the 2018 3DTV Conference, 2018, article id 8478611. Conference paper (Refereed)
    Abstract [en]

    The capturing of the angular and spatial information of a scene using a single camera is made possible by a new emerging technology referred to as the plenoptic camera. Both angular and spatial information enable various post-processing applications, e.g. refocusing, synthetic aperture, super-resolution, and 3D scene reconstruction. In the past, multiple traditional cameras were used to capture the angular and spatial information of the scene. Recently, however, with the advancement in optical technology, plenoptic cameras have been introduced to capture the scene information. In a plenoptic camera, a lenslet array is placed between the main lens and the image sensor, which allows multiplexing of the spatial and angular information onto a single image, also referred to as a plenoptic image. The placement of the lenslet array relative to the main lens and the image sensor results in two different optical designs of a plenoptic camera, referred to as plenoptic 1.0 and plenoptic 2.0. In this work, we present a novel dataset captured with a plenoptic 1.0 camera (Lytro Illum) and a plenoptic 2.0 camera (Raytrix R29) for the same scenes under the same conditions. The dataset provides benchmark contents for various research and development activities on plenoptic images.

    Download full text (pdf)
    fulltext
  • 6.
    Ahmad, Waqas
    et al.
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Systems and Technology.
    Palmieri, Luca
    University of Padova, Italy.
    Koch, Reinhard
    Christian-Albrechts-University of Kiel, Germany.
    Sjöström, Mårten
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Systems and Technology.
    The Plenoptic Dataset (2018). Data set
    Abstract [en]

    The dataset is captured using two different plenoptic cameras, namely Illum from Lytro (based on plenoptic 1.0 model) and R29 from Raytrix (based on plenoptic 2.0 model). The scenes selected for the dataset were captured under controlled conditions. The cameras were mounted onto a multi-camera rig that was mechanically controlled to move the cameras with millimeter precision. In this way, both cameras captured the scene from the same viewpoint.

  • 7.
    Ahmad, Waqas
    et al.
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Systems and Technology.
    Sjöström, Mårten
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Systems and Technology.
    Olsson, Roger
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Systems and Technology.
    Compression scheme for sparsely sampled light field data based on pseudo multi-view sequences (2018). In: Optics, Photonics, and Digital Technologies for Imaging Applications V, Proceedings of SPIE - The International Society for Optical Engineering, SPIE, 2018, Vol. 10679, article id 106790M. Conference paper (Refereed)
    Abstract [en]

    With the advent of light field acquisition technologies, the captured information of the scene is enriched by having both angular and spatial information. The captured information provides additional capabilities in the post-processing stage, e.g. refocusing, 3D scene reconstruction, and synthetic aperture. Light field capturing devices are classified into two categories. In the first category, a single plenoptic camera is used to capture a densely sampled light field, and in the second category, multiple traditional cameras are used to capture a sparsely sampled light field. In both cases, the size of the captured data increases with the additional angular information. The recent call for proposals related to the compression of light field data issued by JPEG, known as "JPEG Pleno", reflects the need for a new and efficient light field compression solution. In this paper, we propose a compression solution for sparsely sampled light field data. In a multi-camera system, each view depicts the scene from a single perspective. We propose to interpret each single view as a frame of a pseudo video sequence. In this way, the complete MxN views of a multi-camera system are treated as M pseudo video sequences, where each pseudo video sequence contains N frames. The central pseudo video sequence is taken as the base view, and the first frame in all the pseudo video sequences is taken as the base picture order count (POC). The frame contained in the base view at the base POC is labeled as the base frame. The remaining frames are divided into three predictor levels. Frames placed in each successive level can take prediction from previously encoded frames; however, the frames assigned to the last predictor level are not used for the prediction of other frames. Moreover, the rate allocation for each frame is performed by taking into account its predictor level, its frame distance, and its view-wise decoding distance relative to the base frame. The multi-view extension of high efficiency video coding (MV-HEVC) is used to compress the pseudo multi-view sequences. The MV-HEVC compression standard enables frames to take prediction in both directions (horizontal and vertical), and MV-HEVC parameters are used to implement the proposed 2D prediction and rate allocation scheme. A subset of four light field images from the Stanford dataset is compressed using the proposed compression scheme at four bitrates, covering low- to high-bitrate scenarios. The comparison is made with the state-of-the-art reference encoder HEVC and its real-time implementation x265. The 17x17 grid is converted into a single pseudo sequence of 289 frames by following the order explained in the JPEG Pleno call for proposals and given as input to both reference schemes. The rate-distortion analysis shows that the proposed compression scheme outperforms both reference schemes in all tested bitrate scenarios for all test images. The average BD-PSNR gain is 1.36 dB over HEVC and 2.15 dB over x265.
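The base view / base POC / predictor-level organization described above can be sketched with a small helper. The exact level-assignment rule is not spelled out in the abstract, so the rule below (base frame, then base row/column, then a parity split) is an assumed stand-in for illustration only.

```python
# Sketch of the frame organization: the central pseudo video sequence is the
# base view, POC 0 is the base picture, and the remaining frames fall into
# predictor levels that govern who may reference whom. The level rule here
# is hypothetical -- the paper's actual rule may differ.

def predictor_level(view, poc, num_views):
    """Return 0 for the base frame, 1-3 for successive predictor levels."""
    base_view = num_views // 2
    if view == base_view and poc == 0:
        return 0                      # base frame
    if view == base_view or poc == 0:
        return 1                      # base-view row / base-POC column
    # Assumed split: level 3 frames are never used as references.
    return 2 if (view % 2 == 0 and poc % 2 == 0) else 3

# Level map for a 17x17 grid treated as 17 PVSs of 17 frames each.
levels = {(v, p): predictor_level(v, p, 17) for v in range(17) for p in range(17)}
```

Rate allocation would then key off this level together with each frame's distance from the base frame, as the abstract describes.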

    Download full text (pdf)
    fulltext
  • 8.
    Ahmad, Waqas
    et al.
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Systems and Technology.
    Vagharshakyan, Suren
    Tampere University of Technology, Finland.
    Sjöström, Mårten
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Systems and Technology.
    Gotchev, Atanas
    Tampere University of Technology, Finland.
    Bregovic, Robert
    Tampere University of Technology, Finland.
    Olsson, Roger
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Systems and Technology.
    Shearlet Transform Based Prediction Scheme for Light Field Compression (2018). Conference paper (Refereed)
    Abstract [en]

    Light field acquisition technologies capture the angular and spatial information of the scene. The spatial and angular information enables various post-processing applications, e.g. 3D scene reconstruction, refocusing, and synthetic aperture, at the expense of an increased data size. In this paper, we present a novel prediction tool for the compression of light field data acquired with a multiple-camera system. The captured light field (LF) can be described using a two-plane parametrization as L(u, v, s, t), where (u, v) represents the coordinates of each view's image plane and (s, t) represents the coordinates of the capturing plane. In the proposed scheme, the captured LF is uniformly decimated by a factor d in both directions (in the s and t coordinates), resulting in a sparse set of views, also referred to as key views. The key views are converted into a pseudo video sequence and compressed using high efficiency video coding (HEVC). The shearlet transform based reconstruction approach presented in [1] is used at the decoder side to predict the decimated views with the help of the key views. Four LF images (Truck and Bunny from the Stanford dataset, Set2 and Set9 from the High Density Camera Array dataset) are used in the experiments. The input LF views are converted into a pseudo video sequence and compressed with HEVC to serve as an anchor. Rate-distortion analysis shows an average PSNR gain of 0.98 dB over the anchor scheme. Moreover, at low bit rates the compression efficiency of the proposed scheme is higher than that of the anchor, whereas the anchor performs better at high bit rates. The different compression responses of the proposed and anchor schemes are a consequence of how they utilize the input information. In the high-bit-rate scenario, high-quality residual information enables the anchor to achieve efficient compression. By contrast, the shearlet transform relies on the key views to predict the decimated views without incorporating residual information; hence, it has an inherent reconstruction error. In the low-bit-rate scenario, the bit budget of the proposed compression scheme allows the encoder to achieve high quality for the key views. The HEVC anchor scheme distributes the same bit budget among all the input LF views, which results in a degradation of the overall visual quality. The sensitivity of the human visual system toward compression artifacts in low-bit-rate cases favours the proposed compression scheme over the anchor scheme.
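The key-view selection step described above (uniform decimation by a factor d in both s and t) is easy to sketch. The helper name is hypothetical; the shearlet-based reconstruction of the remaining views is the paper's contribution and is not reproduced here.

```python
# Sketch: uniformly decimate a grid_size x grid_size light field by factor d
# in both s and t. The retained "key views" would be HEVC-coded; the
# decimated views would be predicted at the decoder from the key views.

def split_key_views(grid_size, d):
    """Return (key_views, decimated_views) as lists of (s, t) indices."""
    key = [(s, t) for s in range(0, grid_size, d) for t in range(0, grid_size, d)]
    key_set = set(key)
    decimated = [(s, t) for s in range(grid_size) for t in range(grid_size)
                 if (s, t) not in key_set]
    return key, decimated

# A 17x17 Stanford-style grid with d = 4 keeps a 5x5 grid of key views.
key, dec = split_key_views(17, d=4)
```

With 25 key views out of 289, the encoder can spend the bit budget on a small, high-quality subset, which is exactly the low-bitrate advantage the abstract argues for.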

    Download full text (pdf)
    fulltext
  • 9.
    Ahmad, Waqas
    et al.
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Systems and Technology.
    Vagharshakyan, Suren
    Tampere Univ Technol, Korkeakoulunkatu 10, Tampere 33720, Finland.
    Sjöström, Mårten
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Systems and Technology.
    Gotchev, Atanas
    Tampere Univ Technol, Korkeakoulunkatu 10, Tampere 33720, Finland.
    Bregovic, Robert
    Tampere Univ Technol, Korkeakoulunkatu 10, Tampere 33720, Finland.
    Olsson, Roger
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Systems and Technology.
    Shearlet Transform Based Prediction Scheme for Light Field Compression (2018). In: 2018 Data Compression Conference (DCC 2018) / [ed] Bilgin, A., Marcellin, M.W., Serra-Sagrista, J., Storer, J.A., IEEE, 2018, p. 396. Conference paper (Refereed)
  • 10.
    Ahmad, Waqas
    et al.
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Systems and Technology.
    Vagharshakyan, Suren
    Tampere University, Tampere, Finland.
    Sjöström, Mårten
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Systems and Technology.
    Gotchev, Atanas
    Tampere University, Tampere, Finland.
    Bregovic, Robert
    Tampere University, Tampere, Finland.
    Olsson, Roger
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Systems and Technology.
    Shearlet Transform-Based Light Field Compression under Low Bitrates (2020). In: IEEE Transactions on Image Processing, ISSN 1057-7149, E-ISSN 1941-0042, Vol. 29, p. 4269-4280, article id 8974608. Article in journal (Refereed)
    Abstract [en]

    Light field (LF) acquisition devices capture the spatial and angular information of a scene. In contrast with traditional cameras, the additional angular information enables novel post-processing applications, such as 3D scene reconstruction, the ability to refocus at different depth planes, and synthetic aperture. In this paper, we present a novel compression scheme for LF data captured using multiple traditional cameras. The input LF views were divided into two groups: key views and decimated views. The key views were compressed using the multi-view extension of high-efficiency video coding (MV-HEVC), and the decimated views were predicted using the shearlet-transform-based prediction (STBP) scheme. Additionally, the residual information of the predicted views was encoded and sent along with the coded stream of key views. The proposed scheme was evaluated over benchmark multi-camera-based LF datasets, demonstrating that incorporating the residual information into the compression scheme increased the overall peak signal-to-noise ratio (PSNR) by 2 dB. The proposed compression scheme performed significantly better at low bit rates than the anchor schemes, which have better compression efficiency in high-bit-rate scenarios. The sensitivity of the human visual system towards compression artifacts, specifically at low bit rates, favors the proposed compression scheme over the anchor schemes.
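The PSNR figures quoted in this abstract follow the standard peak signal-to-noise ratio definition; a minimal reference implementation for 8-bit samples (illustrative, not taken from the paper):

```python
import math

def psnr(original, reconstructed, peak=255.0):
    """Peak signal-to-noise ratio between two equal-length pixel sequences."""
    mse = sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)
    if mse == 0:
        return float('inf')           # identical signals
    return 10.0 * math.log10(peak * peak / mse)

# A uniform error of 1 gray level gives MSE = 1, i.e. PSNR = 20*log10(255).
value = psnr([10, 20, 30, 40], [11, 21, 31, 41])
```

A "2 dB gain" thus corresponds to roughly a 37% reduction in mean squared error at the same bit rate.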

    Download full text (pdf)
    fulltext
  • 11. Blas, A.
    et al.
    Hancock, S.
    Koscielniak, S.
    Lindroos, M.
    Sjöström, Mårten
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Technology and Media.
    Evaluation of vector signal analyzer for beam transfer function measurements in PS Booster (1999). Report (Other scientific)
  • 12.
    Boström, Lena
    et al.
    Mid Sweden University, Faculty of Human Sciences, Department of Education.
    Sjöström, Mårten
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Systems and Technology.
    Designing and Evaluating an Interactive Learning Resource for Scientific Methods: Visual Learning Support and Visualization of Research Process Structure (2021). Conference paper (Refereed)
  • 13.
    Boström, Lena
    et al.
    Mid Sweden University, Faculty of Human Sciences, Department of Education.
    Sjöström, Mårten
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Systems and Technology.
    MethodViz: designing and evaluating an interactive learning tool for scientific methods – visual learning support and visualization of research process structure (2022). In: Education and Information Technologies: Official Journal of the IFIP technical committee on Education, ISSN 1360-2357, E-ISSN 1573-7608, Vol. 27, no 9, p. 12793-12810. Article in journal (Refereed)
    Abstract [en]

    In this study, we focussed on designing and evaluating a learning tool for the research process in higher education. Mastering the research process seems to be a bottleneck within the academy; therefore, there is a great need to offer students ways to learn this skill beyond books and lectures. The MethodViz tool supports the ubiquitous aspects of the research process that higher education students follow in their scientific works. Moreover, the tool facilitates and structures the process interactively. In this paper, we describe the creation process of the artefact and examine the characteristics and scope of MethodViz alongside the traits and ideas of design science research. The evaluation's results are encouraging and show that MethodViz has the potential to improve students' learning achievements.

  • 14.
    Boström, Lena
    et al.
    Mid Sweden University, Faculty of Human Sciences, Department of Education.
    Sjöström, Mårten
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Systems and Technology.
    Karlsson, Håkan
    Mid Sweden University, Faculty of Human Sciences, Department of Education.
    Sundgren, Marcus
    Mid Sweden University, Faculty of Human Sciences, Department of Education.
    Andersson, Mattias
    Olsson, Roger
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Systems and Technology.
    Åhlander, Jimmy
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Systems and Technology.
    Digital visualisering i skolan: Mittuniversitetets slutrapport från förstudien [Digital visualization in schools: Mid Sweden University's final report from the pilot study] (2018). Report (Other academic)
    Abstract [sv]

    The purpose of this study was twofold: to test alternative teaching methods via a digital learning resource in mathematics in a quasi-experimental study, and to apply user-experience methods to interactive visualizations, thereby increasing knowledge of how perceived quality depends on the technology used. The pilot study also focuses on several pressing areas within school development, both regionally and nationally, as well as on important aspects of the connection between technology, pedagogy, and evaluation methods within "the technical part". The former concerns declining mathematics results in schools, practice-based school research, strengthened digital competence, visualization and learning, and research on visualization and evaluation. The latter addresses which technical solutions have previously been used and for what purpose they were created, and how visualizations have been evaluated in textbooks and in the research literature.

    Regarding the students' results, one of the major research questions in the study, we found no significant differences between traditional teaching and teaching with the visualization learning resource (3D). Concerning students' attitudes towards the mathematics unit, attitudes improved significantly in the control group for grade 6, but not in grade 8. Regarding girls' and boys' results and attitudes, the girls in both classes had better prior knowledge than the boys, and in grade 6 the girls in the control group were more positive towards the mathematics unit than the boys. Beyond that, we could discern no significant differences. Other important findings were that the test design was not optimal and that the time of day at which the test was administered mattered considerably. The results of the qualitative analysis point to positive attitudes and behaviours among the students when working with the visual learning resource. The students' collaboration and communication improved during the lessons. Furthermore, the teachers noted that the 3D learning resource offered greater opportunities to stimulate several senses during the learning process. A clear conclusion is that the 3D learning resource is an important complement in teaching, but cannot be used entirely on its own.

    We can align ourselves neither with the researchers who consider 3D visualization superior as a learning resource for student results, nor with those who warn of its effects on students' cognitive overload. Our results are more in line with the conclusions drawn by the Swedish Institute for Educational Research (Skolforskningsinstitutet, 2017): teaching with digital learning resources in mathematics can have positive effects, but equally effective teaching could possibly be designed in other ways. However, the results of our study point to several disruptive factors that may have affected the possible outcomes, and to the need for good technology and well-developed software.

    In the study, we analyzed the results using two overarching frameworks for integrating technology support in learning: SAMR and TPACK. The former framework contributed a taxonomy for discussing how well the technology's possibilities were exploited by learning resources and in learning activities; the latter supported a discussion of the didactic questions with a focus on the role of technology. Both aspects are highly topical given the increasing digitalization of schools.

    Based on previous research and this pilot study, we understand that it is important to design the research methods carefully. Randomization of groups would be desirable. Performance measures can also be difficult to choose. Tests in which participants evaluate usability and user experience (UX), based on both qualitative and quantitative methods, are important for the actual use of the technology, but further evaluations are needed to connect the technology and the visualization to the quality of learning and teaching. Several methods are thus needed, and collaboration between different subjects and disciplines becomes important.

    Download full text (pdf)
    fulltext
  • 15.
    Brunnström, Kjell
    et al.
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Systems and Technology. RISE Research Institute of Sweden AB.
    Dima, Elijs
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Systems and Technology.
    Andersson, Mattias
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Systems and Technology.
    Sjöström, Mårten
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Systems and Technology.
    Qureshi, Tahir
    HIAB.
    Johanson, Mathias
    Alkit Communications AB.
    Quality of Experience of hand controller latency in a Virtual Reality simulator (2019). In: Human Vision and Electronic Imaging 2019 / [ed] Damon Chandler, Mark McCourt and Jeffrey Mulligan, Springfield, VA, United States, 2019, article id 3068450. Conference paper (Refereed)
    Abstract [en]

    In this study, we investigate a VR simulator of a forestry crane used for loading logs onto a truck, mainly looking at Quality of Experience (QoE) aspects that may be relevant for task completion, but also whether any discomfort-related symptoms are experienced during task execution. A QoE test was designed to capture both the general subjective experience of using the simulator and task performance. Moreover, a specific focus was to study the effects of latency on the subjective experience, with regard to delays in the crane control interface. A formal subjective study was performed in which we added controlled delays to the hand controller (joystick) signals. The added delays ranged from 0 ms to 800 ms. We found no significant effects of delays up to 200 ms on task performance or on any of the rating scales. A significant negative effect was found for 800 ms of added delay. The symptoms reported in the Simulator Sickness Questionnaire (SSQ) were significantly higher for all symptom groups, but a majority of the participants reported only slight symptoms. Two out of thirty test persons stopped the test before finishing due to their symptoms.

    Download full text (pdf)
    fulltext
  • 16.
    Brunnström, Kjell
    et al.
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Systems and Technology. RISE Research Institutes of Sweden AB.
    Dima, Elijs
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Systems and Technology.
    Qureshi, Tahir
    HIAB AB, Hudiksvall, Sweden.
    Johanson, Mathias
    Alkit Communications AB, Mölndal, Sweden.
    Andersson, Mattias
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Systems and Technology.
    Sjöström, Mårten
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Systems and Technology.
    Latency impact on Quality of Experience in a virtual reality simulator for remote control of machines (2020). In: Signal Processing: Image Communication, ISSN 0923-5965, Vol. 89, no. Nov, article id 116005. Article in journal (Refereed)
    Abstract [en]

    In this article, we have investigated a VR simulator of a forestry crane used for loading logs onto a truck. We have mainly studied the Quality of Experience (QoE) aspects that may be relevant for task completion, and whether any discomfort-related symptoms are experienced during task execution. QoE experiments were designed to capture the general subjective experience of using the simulator, and to study task performance. The focus was to study the effects of latency on the subjective experience, with regard to delays in the crane control interface. Subjective studies were performed with controlled delays added to the display update and to the hand controller (joystick) signals. The added delays ranged from 0 to 30 ms for the display update, and from 0 to 800 ms for the hand controller. We found a strong effect of latency in the display update, and a significant negative effect of 800 ms of added delay in the hand controller (in total approx. 880 ms of latency including the system delay). The Simulator Sickness Questionnaire (SSQ) gave significantly higher scores after the experiment compared to before it, but a majority of the participants reported experiencing only minor symptoms. Some test subjects ceased the test before finishing due to their symptoms, particularly due to the added latency in the display update.

    Download full text (pdf)
    fulltext
  • 17.
    Brunnström, Kjell
    et al.
    Acreo AB, Kista, Sweden.
    Sedano, Iñigo
    Tecnalia Research & Innovation, Bilbao, Spain.
    Wang, Kun
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Technology and Media.
    Barkowsky, Markus
    IRCCyN, Nantes, France.
    Kihl, Maria
    Lund University.
    Andrén, Börje
    Acreo AB, Kista, Sweden.
    Le Callet, Patrick
    IRCCyN, Nantes, France.
    Sjöström, Mårten
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Technology and Media.
    Aurelius, Andreas
    Acreo AB, Kista, Sweden.
    2D no-reference video quality model development and 3D video transmission quality (2012). In: Proceedings of the Sixth International Workshop on Video Processing and Quality Metrics for Consumer Electronics VPQM-2012, 2012. Conference paper (Other academic)
    Abstract [en]

    This presentation targets two topics in video quality assessment. First, we discuss 2D no-reference video quality model development. Second, we discuss how to find suitable quality for 3D video transmission. No-reference metrics are the only practical option for monitoring 2D video quality in live networks. To decrease the development time, it might be possible to use full-reference metrics for this purpose. In this work, we have evaluated six full-reference objective metrics on three different databases, and show statistically that VQM performs best. We then use these results to develop a lightweight no-reference model. We have also investigated users' experience of stereoscopic 3D video quality by rating two subjective assessment datasets, one targeting efficient transmission in the transmission-error-free case and the other targeting error concealment. Among other results, it was shown that, at the same level of quality of experience, spatial down-sampling may lead to better bitrate efficiency, whereas temporal down-sampling performs worse. When network impairments occur, traditional 2D error concealment methods need to be re-investigated, as they were outperformed by switching to 2D presentation.

    Download full text (pdf)
    fulltext
  • 18.
    Brunnström, Kjell
    et al.
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Systems and Technology. RISE Acreo AB.
    Sjöström, Mårten
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Systems and Technology.
    Imran, Muhammad
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Electronics Design. HIAB AB.
    Pettersson, Magnus
    HIAB AB.
    Johanson, Mathias
    Alkit Communications AB, Mölndal.
    Quality of Experience for a Virtual Reality Simulator (2018). In: IS&T International Symposium on Electronic Imaging Science and Technology 2018, 2018. Conference paper (Refereed)
    Abstract [en]

    In this study, we investigate a VR simulator of a forestry crane used for loading logs onto a truck, mainly looking at Quality of Experience (QoE) aspects that may be relevant for task completion, but also whether there are any discomfort related symptoms experienced during task execution. The QoE test has been designed to capture both the general subjective experience of using the simulator and to study task completion rate. Moreover, a specific focus has been to study the effects of latency on the subjective experience, with regards both to delays in the crane control interface as well as lag in the visual scene rendering in the head mounted display (HMD). Two larger formal subjective studies have been performed: one with the VR-system as it is and one where we have added controlled delay to the display update and to the joystick signals. The baseline study shows that most people are more or less happy with the VR-system and that it does not have strong effects on any symptoms as listed in the SSQ. In the delay study we found significant effects on Comfort Quality and Immersion Quality for higher Display delay (30 ms), but very small impact of joystick delay. Furthermore, the Display delay had strong influence on the symptoms in the SSQ, as well as causing test subjects to decide not to continue with the complete experiments, and this was also found to be connected to the longer Display delays (≥ 20 ms).

    Download full text (pdf)
    fulltext
  • 19.
    Conti, Caroline
    et al.
    University of Lisbon, Portugal.
    Soares, Luis Ducla
    University of Lisbon, Portugal.
    Nunes, Paulo
    University of Lisbon, Portugal.
    Perra, Cristian
    University of Cagliari, Italy.
    Assunção, Pedro Amado
    Instituto de Telecomunicações and Politécnico de Leiria, Portugal.
    Sjöström, Mårten
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Systems and Technology.
    Li, Yun
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Systems and Technology.
    Olsson, Roger
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Systems and Technology.
    Jennehag, Ulf
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Systems and Technology.
    Light Field Image Compression (2018). In: 3D Visual Content Creation, Coding and Delivery / [ed] Assunção, Pedro Amado, Gotchev, Atanas, Cham: Springer, 2018, p. 143-176. Chapter in book (Refereed)
  • 20.
    Damghanian, Mitra
    et al.
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information and Communication systems.
    Olsson, Roger
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information and Communication systems.
    Sjöström, Mårten
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information and Communication systems.
    Depth and Angular Resolution in Plenoptic Cameras (2015). In: 2015 IEEE International Conference on Image Processing (ICIP), September 2015, IEEE, 2015, p. 3044-3048, article id 7351362. Conference paper (Refereed)
    Abstract [en]

    We present a model-based approach to extract the depth and angular resolution in a plenoptic camera. The obtained results for depth and angular resolution are validated against Zemax ray tracing results. The model-based approach gives the location and number of the resolvable depth planes in a plenoptic camera, as well as the angular resolution with regard to disparity in pixels. The approach is straightforward compared to practical measurements and, in contrast to the principal-ray-model approach, can reflect plenoptic camera parameters such as the microlens f-number. Easy and accurate quantification of the different resolution terms forms the basis for designing the capturing setup and choosing a reasonable system configuration for plenoptic cameras. Results from this work will accelerate the customization of plenoptic cameras for particular applications without the need for expensive measurements.

    Download full text (pdf)
    fulltext
  • 21.
    Damghanian, Mitra
    et al.
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Technology and Media.
    Olsson, Roger
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Technology and Media.
    Sjöström, Mårten
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Technology and Media.
    Extraction of the lateral resolution in a plenoptic camera using the SPC model (2012). In: 2012 International Conference on 3D Imaging, IC3D 2012 - Proceedings, IEEE conference proceedings, 2012, article id 6615137. Conference paper (Refereed)
    Abstract [en]

    Established capturing properties like image resolution need to be described thoroughly in complex multidimensional capturing setups such as plenoptic cameras (PC), as these introduce a trade-off between resolution and features such as field of view, depth of field, and signal-to-noise ratio. Models, methods and metrics that assist in exploring and formulating this trade-off are highly beneficial for the study as well as the design of complex capturing systems. This work presents how the important high-level property lateral resolution is extracted from our previously proposed Sampling Pattern Cube (SPC) model. The SPC carries ray information as well as focal properties of the capturing system it models. The proposed operator extracts the lateral resolution from the SPC model throughout an arbitrary number of depth planes, resulting in a depth-resolution profile. We have validated the resolution operator by comparing the achieved lateral resolution with previous results from simpler models and from wave-optics-based Monte Carlo simulations. The lateral resolution predicted by the SPC model agrees with the results from wave-optics-based numerical simulations and strengthens the conclusion that the SPC fills the gap between ray-based models and wave-optics-based models by including the focal information of the system as a model parameter. The SPC is proven to be a simple yet efficient model for extracting the depth-based lateral resolution as a high-level property of a complex plenoptic capturing system.

    Download full text (pdf)
    fulltext
  • 22.
    Damghanian, Mitra
    et al.
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information and Communication systems.
    Olsson, Roger
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information and Communication systems.
    Sjöström, Mårten
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information and Communication systems.
    Performance analysis in Lytro camera: Empirical and model based approaches to assess refocusing quality (2014). In: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings, IEEE conference proceedings, 2014, p. 559-563. Conference paper (Refereed)
    Abstract [en]

    In this paper we investigate the performance of the Lytro camera in terms of its refocusing quality. The refocusing quality of the camera is related to the spatial resolution and the depth of field as the contributing parameters. We quantify the spatial resolution profile as a function of depth using empirical and model-based approaches. The depth of field is then determined by thresholding the spatial resolution profile. In the model-based approach, the previously proposed sampling pattern cube (SPC) model for representation and evaluation of plenoptic capturing systems is utilized. For the experimental resolution measurements, camera evaluation results are extracted from images rendered by the Lytro full-reconstruction rendering method. Results from both the empirical and model-based approaches assess the refocusing quality of the Lytro camera consistently, highlighting the usability of model-based approaches for performance analysis of complex capturing systems.

    Download full text (pdf)
    fulltext
  • 23.
    Damghanian, Mitra
    et al.
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Technology and Media.
    Olsson, Roger
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Technology and Media.
    Sjöström, Mårten
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Technology and Media.
    The Sampling Pattern Cube: A Representation and Evaluation Tool for Optical Capturing Systems (2012). In: Advanced Concepts for Intelligent Vision Systems / [ed] Blanc-Talon, Jacques, Philips, Wilfried, Popescu, Dan, Scheunders, Paul, Zemcík, Pavel, Berlin/Heidelberg: Springer, 2012, p. 120-131. Conference paper (Refereed)
    Abstract [en]

    Knowledge about how the light field is sampled through a camera system gives the information required to investigate interesting camera parameters. We introduce a simple and handy model for looking into the sampling behavior of a camera system, and have applied it to a single-lens system as well as to plenoptic cameras. We have investigated how camera parameters of interest are interpreted in our proposed model-based representation. The model also enables us to compare capturing systems and to investigate how variations in an optical capturing system affect its sampling behavior.

  • 24.
    Damghanian, Mitra
    et al.
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information and Communication systems.
    Olsson, Roger
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information and Communication systems.
    Sjöström, Mårten
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information and Communication systems.
    Erdmann, Arne
    Raytrix Gmbh.
    Perwass, Christian
    Raytrix Gmbh.
    Spatial resolution in a multi-focus plenoptic camera (2014). In: IEEE International Conference on Image Processing, ICIP 2014, IEEE conference proceedings, 2014, p. 1932-1936, article id 7025387. Conference paper (Refereed)
    Abstract [en]

    Evaluation of state-of-the-art plenoptic cameras is necessary for design and application purposes. In this work, spatial resolution is investigated in a multi-focus plenoptic camera using two approaches: empirical and model-based. The Raytrix R29 plenoptic camera is studied, which utilizes three types of micro lenses with different focal lengths in a hexagonal array structure to increase the depth of field. The model-based approach utilizes the previously proposed sampling pattern cube (SPC) model for representation and evaluation of plenoptic capturing systems. For the experimental resolution measurements, spatial resolution values are extracted from images reconstructed by the provided Raytrix reconstruction method. Both the measurement and the SPC-model-based approaches demonstrate a gradual variation of the resolution values over a wide depth range for the multi-focus R29 camera. Moreover, the good agreement between the results from the model-based approach and those from the empirical approach confirms the suitability of the SPC model for evaluating high-level camera parameters, such as the spatial resolution, in a complex capturing system like the R29 multi-focus plenoptic camera.

    Download full text (pdf)
    Damghanian_Spatial_resolution
  • 25.
    Damghanian, Mitra
    et al.
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information and Communication systems.
    Olsson, Roger
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information and Communication systems.
    Sjöström, Mårten
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information and Communication systems.
    Navarro Fructuoso, Hector
    Department of Optics, University of Valencia, Spain.
    Martinez Corral, Manuel
    Department of Optics, University of Valencia, Spain.
    Investigating the lateral resolution in a plenoptic capturing system using the SPC model (2013). In: Proceedings of SPIE - The International Society for Optical Engineering: Digital Photography IX, SPIE - International Society for Optical Engineering, 2013, p. 86600T. Conference paper (Refereed)
    Abstract [en]

    Complex multidimensional capturing setups such as plenoptic cameras (PC) introduce a trade-off between various system properties. Consequently, established capturing properties, like image resolution, need to be described thoroughly for these systems. Models and metrics that assist in exploring and formulating this trade-off are therefore highly beneficial for studying as well as designing complex capturing systems. This work demonstrates the capability of our previously proposed sampling pattern cube (SPC) model to extract the lateral resolution for plenoptic capturing systems. The SPC carries both ray information and focal properties of the capturing system it models. The proposed operator extracts the lateral resolution from the SPC model throughout an arbitrary number of depth planes, giving a depth-resolution profile. The operator utilizes the focal properties of the capturing system as well as the geometrical distribution of the light containers, which are the elements of the SPC model. We have validated the lateral resolution operator for different capturing setups by comparing the results with those from Monte Carlo numerical simulations based on the wave optics model. The lateral resolution predicted by the SPC model agrees with the results from the more complex wave optics model better than both the ray-based model and our previously proposed lateral resolution operator. This agreement strengthens the conclusion that the SPC fills the gap between ray-based models and the real system performance by including the focal information of the system as a model parameter. The SPC is proven to be a simple yet efficient model for extracting the lateral resolution as a high-level property of complex plenoptic capturing systems.

    Download full text (pdf)
    fulltext
  • 26.
    Dima, Elijs
    et al.
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Systems and Technology.
    Brunnström, Kjell
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Systems and Technology. Division ICT-Acreo, RISE Research Institutes of Sweden.
    Sjöström, Mårten
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Systems and Technology.
    Andersson, Mattias
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Systems and Technology. Mid Sweden University, Faculty of Science, Technology and Media, Department of Design.
    Edlund, Joakim
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Systems and Technology.
    Johanson, Mathias
    Alkit Communications AB, Mölndal.
    Qureshi, Tahir
    HIAB AB, Hudiksvall.
    Joint effects of depth-aiding augmentations and viewing positions on the quality of experience in augmented telepresence (2020). In: Quality and User Experience, ISSN 2366-0139, E-ISSN 2366-0147, Vol. 5, p. 1-17. Article in journal (Refereed)
    Abstract [en]

    Virtual and augmented reality is increasingly prevalent in industrial applications, such as remote control of industrial machinery, due to recent advances in head-mounted display technologies and low-latency communications via 5G. However, the influence of augmentations and camera-placement-based viewing positions on operator performance in telepresence systems remains unknown. In this paper, we investigate the joint effects of depth-aiding augmentations and viewing positions on the quality of experience for operators in augmented telepresence systems. A study was conducted with 27 non-expert participants using a real-time augmented telepresence system to perform a remote-controlled navigation and positioning task, with varied depth-aiding augmentations and viewing positions. The resulting quality of experience was analyzed via Likert opinion scales, task performance measurements, and simulator sickness evaluation. Results suggest that reducing the reliance on stereoscopic depth perception via camera placement has a significant benefit to operator performance and quality of experience. Conversely, the depth-aiding augmentations can partly mitigate the negative effects of inferior viewing positions. However, the viewing-position-based monoscopic and stereoscopic depth cues tend to dominate over cues based on augmentations. There is also a discrepancy between the participants' subjective opinions on augmentation helpfulness and its observed effects on positioning task performance.

    Download full text (pdf)
    fulltext
  • 27.
    Dima, Elijs
    et al.
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Systems and Technology.
    Brunnström, Kjell
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Systems and Technology. RISE Research Institutes of Sweden, Division ICT - Acreo.
    Sjöström, Mårten
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Systems and Technology.
    Andersson, Mattias
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Systems and Technology.
    Edlund, Joakim
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Systems and Technology.
    Johanson, Mathias
    Alkit Communications AB.
    Qureshi, Tahir
    HIAB AB.
    View Position Impact on QoE in an Immersive Telepresence System for Remote Operation (2019). In: 2019 Eleventh International Conference on Quality of Multimedia Experience (QoMEX), IEEE, 2019, p. 1-3. Conference paper (Refereed)
    Abstract [en]

    In this paper, we investigate how different viewing positions affect a user's Quality of Experience (QoE) and performance in an immersive telepresence system. A QoE experiment has been conducted with 27 participants to assess the general subjective experience and the performance of remotely operating a toy excavator. Two view positions have been tested, an overhead view and a ground-level view, which encourage reliance on stereoscopic depth cues to different extents for accurate operation. Results demonstrate a significant difference between the ground and overhead views: the ground view increased the perceived difficulty of the task, whereas the overhead view increased the perceived accomplishment as well as the objective performance of the task. The perceived helpfulness of the overhead view was also significant according to the participants.

    Download full text (pdf)
    Dima2019ViewPositionImpact
  • 28.
    Dima, Elijs
    et al.
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Systems and Technology.
    Gao, Yuan
    Institute of Computer Science, Christian-Albrechts University of Kiel, Germany.
    Sjöström, Mårten
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Systems and Technology.
    Olsson, Roger
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Systems and Technology.
    Koch, Reinhard
    Institute of Computer Science, Christian-Albrechts University of Kiel, Germany.
    Esquivel, Sandro
    Institute of Computer Science, Christian-Albrechts University of Kiel, Germany.
    Estimation and Post-Capture Compensation of Synchronization Error in Unsynchronized Multi-Camera Systems (2021). Report (Other academic)
    Abstract [en]

    Multi-camera systems are used in entertainment production, computer vision, industry and surveillance. The benefit of using multi-camera systems is the ability to recover the 3D structure, or depth, of the recorded scene. However, various types of cameras, including depth cameras, cannot be reliably synchronized during recording, which leads to errors in depth estimation and scene rendering. The aim of this work is to propose a method for compensating synchronization errors in already recorded sequences, without changing the format of the recorded sequences. We describe a depth uncertainty model for parametrizing the impact of synchronization errors in a multi-camera system, and propose a method for synchronization error estimation and compensation. The proposed method is based on interpolating an image at a desired time frame from adjacent non-synchronized images in a single camera's sequence, using an array of per-pixel distortion vectors. This array is generated by using the difference between adjacent images to locate and segment the recorded moving objects, and does not require any object texture or distinguishing features beyond the observed difference in adjacent images. The proposed compensation method is compared with optical-flow-based interpolation and sparse-correspondence-based morphing, and the proposed synchronization error estimation is compared with a state-of-the-art video alignment method. The proposed method shows better synchronization error estimation accuracy and compensation ability, especially in cases of low-texture, low-feature images. The effect of using data with synchronization errors is also demonstrated, as is the improvement gained by using compensated data. The compensation of synchronization errors is useful in scenarios where the recorded data is expected to be used by other processes that expect sub-frame synchronization accuracy, such as depth-image-based rendering.

    Download full text (pdf)
    fulltext
  • 29.
    Dima, Elijs
    et al.
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Systems and Technology.
    Sjöström, Mårten
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Systems and Technology.
    Camera and Lidar-based View Generation for Augmented Remote Operation in Mining Applications (2021). In: IEEE Access, E-ISSN 2169-3536, Vol. 9, p. 82199-82212. Article in journal (Refereed)
    Abstract [en]

    Remote operation of diggers, scalers, and other tunnel-boring machines has significant benefits for worker safety in underground mining. Real-time augmentation of the presented remote views can further improve the operator effectiveness through a more complete presentation of relevant sections of the remote location. In safety-critical applications, such augmentation cannot depend on preconditioned data, nor generate plausible-looking yet inaccurate sections of the view. In this paper, we present a capture and rendering pipeline for real time view augmentation and novel view synthesis that depends only on the inbound data from lidar and camera sensors. We suggest an on-the-fly lidar filtering for reducing point oscillation at no performance cost, and a full rendering process based on lidar depth upscaling and in-view occluder removal from the presented scene. Performance assessments show that the proposed solution is feasible for real-time applications, where per-frame processing fits within the constraints set by the inbound sensor data and within framerate tolerances for enabling effective remote operation.

  • 30.
    Dima, Elijs
    et al.
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information and Communication systems.
    Sjöström, Mårten
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information and Communication systems.
    Olsson, Roger
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information and Communication systems.
    Assessment of Multi-Camera Calibration Algorithms for Two-Dimensional Camera Arrays Relative to Ground Truth Position and Direction (2016). In: 3DTV-Conference, IEEE Computer Society, 2016, article id 7548887. Conference paper (Refereed)
    Abstract [en]

    Camera calibration methods are commonly evaluated on cumulative reprojection error metrics, on disparate one-dimensional datasets. To evaluate the calibration of cameras in two-dimensional arrays, assessments need to be made on two-dimensional datasets with constraints on camera parameters. In this study, the accuracy of several multi-camera calibration methods has been evaluated on the camera parameters that affect view projection the most. As input data, we used a 15-viewpoint two-dimensional dataset with intrinsic and extrinsic parameter constraints and extrinsic ground truth. The assessment showed that self-calibration methods using structure-from-motion reach the same intrinsic and extrinsic parameter estimation accuracy as the standard checkerboard calibration algorithm, and surpass a well-known self-calibration toolbox, BlueCCal. These results show that self-calibration is a viable approach to calibrating two-dimensional camera arrays, but improvements to state-of-the-art multi-camera feature matching are necessary to make BlueCCal as accurate as other self-calibration methods for two-dimensional camera arrays.

    Download full text (pdf)
    AssessmentOfMultiCameraCalibrationAlgorithms
  • 31.
    Dima, Elijs
    et al.
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Systems and Technology.
    Sjöström, Mårten
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Systems and Technology.
    Olsson, Roger
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Systems and Technology.
    Modeling Depth Uncertainty of Desynchronized Multi-Camera Systems (2017). In: 2017 International Conference on 3D Immersion (IC3D), IEEE, 2017. Conference paper (Refereed)
    Abstract [en]

    Accurately capturing motion from multiple perspectives is relevant for recording and processing immersive multimedia and virtual reality content. However, synchronization errors between multiple cameras limit the precision of scene depth reconstruction and rendering. In order to quantify this limit, a relation between camera desynchronization, camera parameters, and scene element motion has to be identified. In this paper, a parametric ray model describing depth uncertainty is derived and adapted for the pinhole camera model. A two-camera scenario is simulated to investigate the model behavior and how camera synchronization delay, scene element speed, and camera positions affect the system's depth uncertainty. Results reveal a linear relation between synchronization error, element speed, and depth uncertainty. View convergence is shown to affect mean depth uncertainty by up to a factor of 10. Results also show that depth uncertainty must be assessed on the full set of camera rays instead of a central subset.

    Download full text (pdf)
    fulltext
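    The linear relation between synchronization error, element speed, and depth uncertainty reported in the abstract can be sanity-checked with a much simpler model than the paper's parametric ray model: a parallel two-camera rig whose second camera fires dt seconds late (a first-order sketch under assumed geometry; all names and values are illustrative):

```python
def depth_uncertainty(z, v, dt, baseline, f):
    """First-order depth error for a parallel stereo pair when the
    second camera is triggered dt seconds late and the scene point at
    depth z moves laterally with speed v. With disparity d = f*b/z and
    a disparity error f*v*dt/z, the depth error is z*v*dt/b; note that
    the focal length f cancels to first order."""
    disparity = f * baseline / z
    disparity_error = f * v * dt / z
    return z * disparity_error / disparity

# Doubling either the sync delay or the element speed doubles the
# depth uncertainty -- the linear behaviour reported in the paper.
err = depth_uncertainty(z=5.0, v=1.0, dt=0.02, baseline=0.5, f=800.0)
```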
  • 32.
    Dima, Elijs
    et al.
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Systems and Technology.
    Sjöström, Mårten
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Systems and Technology.
    Olsson, Roger
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Systems and Technology.
    Kjellqvist, Martin
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Systems and Technology.
    Litwic, Lukasz
    Ericsson AB.
    Zhang, Zhi
    Ericsson AB.
    Rasmusson, Lennart
    Observit AB.
    Flodén, Lars
    Observit AB.
    LIFE: A Flexible Testbed For Light Field Evaluation2018Conference paper (Refereed)
    Abstract [en]

    Recording and imaging the 3D world has led to the use of light fields. Capturing, distributing and presenting light field data is challenging, and requires an evaluation platform. We define a framework for real-time processing, and present the design and implementation of a light field evaluation system. In order to serve as a testbed, the system is designed to be flexible, scalable, and able to model various end-to-end light field systems. This flexibility is achieved by encapsulating processes and devices in discrete framework systems. The modular capture system supports multiple camera types, general-purpose data processing, and streaming to network interfaces. The cloud system allows for parallel transcoding and distribution of streams. The presentation system encapsulates rendering and display specifics. The real-time ability was tested in a latency measurement; the capture and presentation systems process and stream frames within a 40 ms limit.

    Download full text (pdf)
    fulltext
  • 33. Djukic, D.
    et al.
    Sjöström, Mårten
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Technology and Media.
    Dutoit, B.
    Preisach-type hysteresis modelling in Bi-2223 tapes1997In: Applied Superconductivity 1997.: Proceedings of EUCAS 1997 Third European Conference on Applied Superconductivity, Vol. 2, 1997, p. 1409-1412Conference paper (Other scientific)
  • 34.
    Domanski, Marek
    et al.
    Poznan University, Poland.
    Grajek, Tomasz
    Poznan University, Poland.
    Conti, Caroline
    University of Lisbon, Portugal.
    Debono, Carl James
    University of Malta, Malta.
    de Faria, Sérgio M. M.
    Institute de Telecomunicacôes and Politecico de Leiria, Portugal.
    Kovacs, Peter
    Holografika, Budapest, Hungary.
    Lucas, Luis F.R.
    Institute de Telecomunicacôes and Politecico de Leiria, Portugal.
    Nunes, Paulo
    University of Lisbon, Portugal.
    Perra, Cristian
    University of Cagliari, Italy.
    Rodrigues, Nuno M.M.
    Institute de Telecomunicacôes and Politecico de Leiria, Portugal.
    Sjöström, Mårten
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Systems and Technology.
    Soares, Luis Ducla
    University of Lisbon, Portugal.
    Stankiewicz, Olgierd
    Poznan university, Poland.
    Emerging Imaging Technologies: Trends and Challenges2018In: 3D Visual Content Creation, Coding and Delivery / [ed] Assunção, Pedro Amado, Gotchev, Atanas, Cham: Springer, 2018, p. 5-39Chapter in book (Refereed)
  • 35. Dutoit, B.
    et al.
    Sjöström, Mårten
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Technology and Media.
    Stavrev, S.
    Bi(2223) Ag Sheathed Tape Ic and Exponent n Characterization and Modelling under DC Applied Magnetic Field1999In: IEEE Transactions on Applied Superconductivity, ISSN 1051-8223, Vol. 9, no 2, p. 809-812Article in journal (Refereed)
  • 36. Dutoit, B.
    et al.
    Stavrev, S.
    Sjöström, Mårten
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Technology and Media.
    Bi(2223) Ag sheathed tape characterisation under DC applied magnetic field1998In: Proceedings of the Seventeenth International Cryogenic Engineering Conference: ICEC 17, Bristol, UK: IOP Publishing , 1998, p. 419-422Conference paper (Other scientific)
  • 37.
    Edlund, Joakim
    et al.
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Systems and Technology.
    Guillemot, Christine
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Systems and Technology. Institut National de Recherche en Informatique et en Automatique, Rennes, France.
    Sjöström, Mårten
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Systems and Technology.
    Analysis of Top-Down Connections in Multi-Layered Convolutional Sparse Coding2021In: 2021 IEEE 23rd International Workshop on Multimedia Signal Processing (MMSP), IEEE, 2021Conference paper (Refereed)
    Abstract [en]

    Convolutional Neural Networks (CNNs) have been instrumental in the recent advances in machine learning, with applications in media processing. Multi-Layered Convolutional Sparse Coding (ML-CSC), based on a cascade of convolutional layers in which each layer can be approximately explained by the following layer, can be seen as a biologically inspired framework. However, both CNNs and ML-CSC networks lack the top-down information flows that are studied in neuroscience for understanding the mechanisms of the mammalian cortex. A successful implementation of such top-down connections could lead to another leap in machine learning and media applications. This study analyses the effects of a feedback connection on an ML-CSC network, considering the trade-off between sparsity and reconstruction error, support recovery rate, and mutual coherence in trained dictionaries. We find that using the feedback connection during training impacts the mutual coherence of the dictionary in such a way that the equivalence between the $l_0$- and $l_1$-norms is verified for a smaller range of sparsity values. Experimental results show that the use of feedback during training does not favour inference with feedback, in terms of sparse support recovery rates. However, when the sparsity constraints are given a lower weight, the use of feedback at inference time is beneficial, in terms of support recovery rates.
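    Mutual coherence, one of the dictionary properties analysed above, has a standard definition that is easy to compute (a generic sketch, not the paper's code; the example dictionary is made up):

```python
import numpy as np

def mutual_coherence(D):
    """Largest absolute inner product between distinct l2-normalised
    atoms (columns) of the dictionary D."""
    Dn = D / np.linalg.norm(D, axis=0, keepdims=True)
    G = np.abs(Dn.T @ Dn)        # Gram matrix of normalised atoms
    np.fill_diagonal(G, 0.0)     # discard self-correlations
    return G.max()

# Three orthonormal atoms plus one correlated atom.
mu = mutual_coherence(np.hstack([np.eye(3), np.ones((3, 1))]))
```

    A classical sparse-recovery result links low coherence to the l0/l1 equivalence discussed in the abstract: the l1 solution recovers the sparsest representation whenever the support size stays below (1 + 1/mu)/2.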

  • 38.
    Eriksson, Magnus
    et al.
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information and Communication systems.
    Rahman, S. M. Hasibur
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information and Communication systems.
    Fraile, Francisco
    Universitat Politècnica de València, Spain .
    Sjöström, Mårten
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information and Communication systems.
    Efficient Interactive Multicast over DVB-T2: Utilizing Dynamic SFNs and PARPS2013In: IEEE International Symposium on Broadband Multimedia Systems and Broadcasting, BMSB, IEEE conference proceedings, 2013, p. Art. no. 6621700-Conference paper (Refereed)
    Abstract [en]

    In the terrestrial digital TV systems DVB-T/H/T2, broadcasting is employed, meaning that all TV programs are sent over all transmitters, even where there are no viewers. This is an inefficient utilization of spectrum and transmitter equipment. Applying interactive multicasting over DVB-T2 is a novel approach that would substantially reduce the spectrum required to deliver a given number of TV programs. Further gain would be achieved by dynamic single-frequency network (DSFN) formation, which can be implemented using the concept of PARPS (Packet and Resource Plan Scheduling). A Zipf-law heterogeneous program selection model is suggested. For a system of four coordinated transmitters, and certain assumptions, IP multicasting over non-continuous transmission DSFN gives a 1740% increase in multiuser system spectral efficiency (MSSE) in (users∙bit/s)/Hz/site as compared to broadcasting over SFN.

    Download full text (pdf)
    bmsb2013-Eriksson.pdf
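    The Zipf-law program selection model mentioned in the abstract assigns request probability inversely proportional to program rank; a minimal sketch (exponent and program count are illustrative, not the paper's parameters):

```python
import numpy as np

def zipf_popularity(n_programs, s=1.0):
    """Zipf-law selection model: the probability of requesting the
    program of popularity rank k is proportional to 1/k**s."""
    ranks = np.arange(1, n_programs + 1)
    w = 1.0 / ranks ** s
    return w / w.sum()

p = zipf_popularity(10)  # ten programs, classic s = 1 exponent
```

    Under such a skewed distribution a few programs attract most requests, which is what makes multicasting only the requested programs cheaper than broadcasting all of them.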
  • 39.
    Gao, Shan
    et al.
    School of Science, Beijing Jiaotong University, Beijing 100044, China.
    Qu, Gangrong
    School of Science, Beijing Jiaotong University, Beijing 100044, China.
    Sjöström, Mårten
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Systems and Technology.
    Liu, Yuhan
    School of Science, Beijing Jiaotong University, Beijing 100044, China.
    A TV regularisation sparse light field reconstruction model based on guided-filtering2022In: Signal processing. Image communication, ISSN 0923-5965, E-ISSN 1879-2677, Vol. 109, article id 116852Article in journal (Refereed)
    Abstract [en]

    Obtaining and representing the 4D light field is important for a number of computer vision applications. Due to the high dimensionality, acquiring the light field directly is costly. One way to overcome this deficiency is to reconstruct the light field from a limited number of measurements. Existing approaches involve either a depth estimation process or require a large number of measurements to obtain high-quality reconstructed results. In this paper, we propose a total variation (TV) regularisation sparse model with the alternating direction method of multipliers (ADMM) based on guided filtering, which addresses this depth-dependence problem with only a few measurements. As one of the sparse optimisation methods, TV regularisation based on ADMM is well suited to solve ill-posed problems such as this. Moreover, guided filtering has good edge-preserving smoothing properties, which can be incorporated into the light field reconstruction process. Therefore, high precision light field reconstruction is established with our model. Specifically, the updated image in the iteration step contains the guidance image, and an initialiser for the least squares method using a QR factorisation (LSQR) algorithm is involved in one of the subproblems. The model outperforms other methods in both visual assessments and objective metrics – in simulation experiments from synthetic data and photographic data using produced focal stacks from light field contents – and it works well in experiments using captured focal stacks. We also show a further application for arbitrary refocusing by using the reconstructed light field.

    The full text will be freely available from 2024-09-19 00:00
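    ADMM solvers for TV-regularised models of this kind typically alternate a least-squares update with an l1 shrinkage step; the two generic building blocks can be sketched as follows (standard textbook operators, not the paper's implementation):

```python
import numpy as np

def soft_threshold(x, tau):
    """Proximal operator of tau*||.||_1: the element-wise shrinkage
    step that appears in ADMM iterations for TV/sparsity-regularised
    problems."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def tv_norm(img):
    """Anisotropic total variation of a 2D array: the l1 norm of its
    horizontal and vertical finite differences."""
    return np.abs(np.diff(img, axis=0)).sum() + np.abs(np.diff(img, axis=1)).sum()

shrunk = soft_threshold(np.array([3.0, -0.5, 1.0]), 1.0)
```

    Values within the threshold are set exactly to zero, which is how the iteration promotes sparse gradients and hence piecewise-smooth reconstructions.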
  • 40.
    Gond, Manu
    et al.
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Computer and Electrical Engineering (2023-).
    Zerman, Emin
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Computer and Electrical Engineering (2023-).
    Knorr, Sebastian
    Ernst-Abbe University of Applied Sciences.
    Sjöström, Mårten
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Computer and Electrical Engineering (2023-).
    LFSphereNet: Real Time Spherical Light Field Reconstruction from a Single Omnidirectional Image2023In: Proceedings of the 20th ACM SIGGRAPH European Conference on Visual Media Production, New York, NY, United States: Association for Computing Machinery (ACM), 2023, p. 1-10Conference paper (Refereed)
    Abstract [en]

    Recent developments in immersive imaging technologies have enabled improved telepresence applications. Being fully matured in the commercial sense, omnidirectional (360-degree) content provides full vision around the camera with three degrees of freedom (3DoF). Considering the applications in real-time immersive telepresence, this paper investigates how a single omnidirectional image (ODI) can be used to extend 3DoF to 6DoF. To achieve this, we propose a fully learning-based method for spherical light field reconstruction from a single omnidirectional image. The proposed LFSphereNet utilizes two different networks: The first network learns to reconstruct the light field in cubemap projection (CMP) format given the six cube faces of an omnidirectional image and the corresponding cube face positions as input. The cubemap format implies a linear re-projection, which is more appropriate for a neural network. The second network refines the reconstructed cubemaps in equirectangular projection (ERP) format by removing cubemap border artifacts. The network learns the geometric features implicitly for both translation and zooming when an appropriate cost function is employed. Furthermore, it runs with very low inference time, which enables real-time applications. We demonstrate that LFSphereNet outperforms state-of-the-art approaches in terms of quality and speed when tested on different synthetic and real world scenes. The proposed method represents a significant step towards achieving real-time immersive remote telepresence experiences.

  • 41. Grilli, F.
    et al.
    Sjöström, Mårten
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Technology and Media.
    Prediction of resistive and hysteretic losses in a multi-layer high-Tc superconducting cable2004In: Superconductors Science and Technology, ISSN 0953-2048, E-ISSN 1361-6668, Vol. 17, no 3, p. 409-416Article in journal (Refereed)
    Abstract [en]

    In this work, a model of a multi-layer high-Tc superconducting (HTS) cable that computes the current distribution across layers as well as the AC loss is presented. The case of a four-layer cable is analyzed, but the developed method can be applied to a cable with an arbitrary number of layers. The cable is modelled by an equivalent circuit consisting of the following elements: nonlinear resistances, linear self and mutual inductances, and nonlinear, hysteretic inductances. The resistances take into account the typical current-voltage relation of superconductors, the linear inductances introduce coupling among the layers and depend on the geometrical parameters of the cable, and the hysteretic inductances describe the hysteretic behaviour of superconductors. In the presented analysis, the geometrical dimensions of the cable are fixed, except for the pitch length and the winding orientation of the layers. These free parameters are varied in order to partition the current across the layers such that the AC loss in the superconductor is minimized. The presented model allows rapid evaluation of the current distribution across the different layers and computation of the corresponding AC loss. The speed of the computation makes it possible to calculate the losses for many different configurations within a reasonable time. The model was therefore first used for finding the pitch lengths that give an optimal current distribution across the layers and for computing the corresponding AC loss. Secondly, the model was refined to take into account the effects of the magnetic self-field, which, especially at high currents, can considerably reduce the transport capacity of the cable, in particular in the outer layers.
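    The "typical current-voltage relation for superconductors" represented by the nonlinear resistances is commonly the E-J power law; a minimal sketch (the E_c convention and parameter values are assumptions, not taken from the paper):

```python
def power_law_field(i, i_c, n, e_c=1e-4):
    """E-J power law for HTS conductors: E = E_c * (I/I_c)**n.
    Here E_c is the conventional 1 uV/cm criterion expressed per
    metre (1e-4 V/m), I_c is the critical current, and n is the
    exponent describing how sharp the resistive transition is."""
    return e_c * (i / i_c) ** n

# At the critical current the electric field equals E_c by definition;
# a large n makes the transition from lossless to resistive very steep.
e_at_ic = power_law_field(i=100.0, i_c=100.0, n=20)
```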

  • 42.
    Hassan, Ali
    et al.
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Systems and Technology.
    Sjöström, Mårten
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Systems and Technology.
    Zhang, Tingting
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Systems and Technology.
    Egiazarian, Karen
    Light-Weight EPINET Architecture for Fast Light Field Disparity Estimation2022In: Light-Weight EPINET Architecture for Fast Light Field Disparity Estimation: 26-28 Sept. 2022, Shanghai, China, Shanghai, China: IEEE Signal Processing Society, 2022, p. 1-5Conference paper (Refereed)
    Abstract [en]

    Recent deep learning-based light field disparity estimation algorithms require millions of parameters, which demand high computational cost and limit the model deployment. In this paper, an investigation is carried out to analyze the effect of depthwise separable convolution and ghost modules on state-of-the-art EPINET architecture for disparity estimation. Based on this investigation, four convolutional blocks are proposed to make the EPINET architecture a fast and light-weight network for disparity estimation. The experimental results exhibit that the proposed convolutional blocks have significantly reduced the computational cost of EPINET architecture by up to a factor of 3.89, while achieving comparable disparity maps on HCI Benchmark dataset.

    Download full text (pdf)
    fulltext
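    The parameter reduction from depthwise separable convolution claimed above follows directly from counting weights (a generic back-of-the-envelope sketch; the channel sizes are illustrative, not EPINET's):

```python
def conv_params(k, c_in, c_out):
    """Weight count of a standard k x k convolution (biases ignored)."""
    return k * k * c_in * c_out

def depthwise_separable_params(k, c_in, c_out):
    """k x k depthwise convolution (one spatial filter per input
    channel) followed by a 1 x 1 pointwise convolution."""
    return k * k * c_in + c_in * c_out

std = conv_params(3, 70, 70)                 # standard 3x3 layer
dws = depthwise_separable_params(3, 70, 70)  # separable replacement
```

    The ratio works out to 1/c_out + 1/k², i.e. roughly an 8x weight reduction for 3x3 kernels with many channels, which is the kind of saving the paper exploits.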
  • 43.
    Jaldemark, Jimmy
    et al.
    Mid Sweden University, Faculty of Human Sciences, Institution of education.
    Anderson, Karen
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Technology and Media.
    Lindberg, J. Ola
    Mid Sweden University, Faculty of Human Sciences, Institution of education.
    Persson Slumpi, Thomas
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Technology and Media.
    Sefyrin, Johanna
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Technology and Media.
    Snyder, Kristen
    Mid Sweden University, Faculty of Human Sciences, Institution of education.
    Sjöström, Mårten
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Technology and Media.
    Slutrapport delprojekt 3.5.1 Forskning och forskarskolan i e-lärande2011Other (Other (popular science, discussion, etc.))
  • 44.
    Jiang, Meng
    et al.
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Electronics Design.
    Nnonyelu, Chibuzo Joseph
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Electronics Design.
    Lundgren, Jan
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Electronics Design.
    Sjöström, Mårten
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Systems and Technology.
    Thungström, Göran
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Electronics Design.
    Gao, Shan
    Performance Comparison of Omni and Cardioid Directional Microphones for Indoor Angle of Arrival Sound Source Localization2022In: Conference Record - IEEE Instrumentation and Measurement Technology Conference, IEEE, 2022Conference paper (Refereed)
    Abstract [en]

    Sound source localization technology makes it possible to map sound source positions. In this paper, angle-of-arrival (AOA) has been chosen as the method for achieving sound source localization in an indoor enclosed environment. The dynamic environment and reverberations pose a challenge for AOA-based systems in such applications. Taking microphone directionality into account, cardioid-directional microphone systems were chosen for a localization performance comparison with omni-directional microphone systems, in order to investigate which microphone type is superior for AOA indoor sound source localization. To reduce hardware complexity, the number of microphones used during the experiment was limited to 4. A localization improvement based on a weighting factor has been proposed. The comparison was done for both types of microphones with 3 different array manifolds under the same system setup, and it shows that the cardioid-directional microphone system has an overall higher accuracy.
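    The directionality difference underlying this comparison can be sketched with the standard first-order pick-up patterns (textbook formulas, not the paper's measured responses):

```python
import numpy as np

def cardioid_gain(theta):
    """First-order cardioid pick-up pattern: unity on-axis (theta = 0)
    and a null directly behind the capsule (theta = pi)."""
    return 0.5 * (1.0 + np.cos(theta))

def omni_gain(theta):
    """Omni-directional pattern: equal sensitivity in all directions."""
    return np.ones_like(np.asarray(theta, dtype=float))

rear_rejection = cardioid_gain(np.pi)  # near zero: rear sound suppressed
```

    The rear null is what lets a cardioid array attenuate reverberant energy arriving from behind, a plausible reason for its higher indoor AOA accuracy.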

  • 45.
    Jiang, Meng
    et al.
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Computer and Electrical Engineering (2023-).
    Nnonyelu, Chibuzo Joseph
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Computer and Electrical Engineering (2023-).
    Lundgren, Jan
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Computer and Electrical Engineering (2023-).
    Thungström, Göran
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Engineering, Mathematics, and Science Education (2023-).
    Sjöström, Mårten
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Computer and Electrical Engineering (2023-).
    A Coherent Wideband Acoustic Source Localization Using a Uniform Circular Array2023In: Sensors, E-ISSN 1424-8220, Vol. 23, no 11, article id 5061Article in journal (Refereed)
    Abstract [en]

    In modern applications such as robotics, autonomous vehicles, and speaker localization, the computational power for sound source localization applications can be limited when other functionalities get more complex. In such application fields, there is a need to maintain high localization accuracy for several sound sources while reducing computational complexity. The array manifold interpolation (AMI) method applied with the Multiple Signal Classification (MUSIC) algorithm enables sound source localization of multiple sources with high accuracy. However, the computational complexity has so far been relatively high. This paper presents a modified AMI for uniform circular array (UCA) that offers reduced computational complexity compared to the original AMI. The complexity reduction is based on the proposed UCA-specific focusing matrix which eliminates the calculation of the Bessel function. The simulation comparison is done with the existing methods of iMUSIC, the Weighted Squared Test of Orthogonality of Projected Subspaces (WS-TOPS), and the original AMI. The experiment result under different scenarios shows that the proposed algorithm outperforms the original AMI method in terms of estimation accuracy and up to a 30% reduction in computation time. An advantage offered by this proposed method is the ability to implement wideband array processing on low-end microprocessors.

  • 46.
    Karbalaie, Abdolamir
    et al.
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Systems and Technology.
    Abtahi, Farhad
    KTH; Karolinska Institutet, Stockholm, Sweden.
    Sjöström, Mårten
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Systems and Technology.
    Event detection in surveillance videos: a review2022In: Multimedia tools and applications, ISSN 1380-7501, E-ISSN 1573-7721, Vol. 81, no 24, p. 35463-35501Article in journal (Refereed)
    Abstract [en]

    Since 2008, a variety of systems have been designed to detect events in security cameras, and more than a hundred journal articles and conference papers have been published in this field. However, no survey has focused on recognizing events in surveillance systems. This motivated us to provide a comprehensive review of the different event detection systems that have been developed. We start our discussion with the pioneering methods that used the TRECVid-SED dataset and then turn to methods developed using the VIRAT dataset in the TRECVid evaluation. To better understand the designed systems, we describe the components of each method and the modifications of existing methods separately. We outline the significant challenges related to untrimmed security video action detection, and suitable metrics are presented for assessing the performance of the proposed models. Our study indicated that, for the TRECVid-SED dataset, the majority of researchers classified events into two groups on the basis of the number of participants and the duration of the event; depending on the group of events, one or more models were used to identify all the events. For the VIRAT dataset, object detection models were used throughout the work to localize the first-stage activities. In all but one study, a 3D convolutional neural network (3D-CNN) was used to extract spatio-temporal features or classify the different activities. From the review that has been carried out, it is possible to conclude that developing an automatic surveillance event detection system requires accurate and fast object detection in the first stage to localize the activities, and a classification model to draw conclusions from the input values.

    Download full text (pdf)
    fulltext
  • 47.
    Karlsson, Linda
    et al.
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Technology and Media.
    Olsson, Roger
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Technology and Media.
    Sjöström, Mårten
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Technology and Media.
    Temporal filter with bilinear interpolation for ROI video coding2006Report (Other academic)
    Abstract [en]

    In videoconferencing and video over the mobile phone, the main visual information is found within limited regions of the video. This enables improved perceived quality by region-of-interest coding. In this paper we introduce a temporal preprocessing filter that reuses values of the previous frame, by which changes in the background are only allowed for every second frame. This reduces the bit rate by 10-25% or gives an increase in average PSNR of 0.29-0.98 dB. Further processing of the video sequence is necessary for an improved re-allocation of the resources. Motion of the ROI causes absence of necessary background data at the ROI border. We conceal this by using a bilinear interpolation between the current and previous frame at the transition from background to ROI. This results in an improvement in average PSNR of 0.44 – 1.05 dB in the transition area with a minor decrease in average PSNR within the ROI.

    Download full text (pdf)
    FULLTEXT01
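    The background-reuse and border blending described in this abstract can be sketched as a per-pixel weighting between the current and previous frames (a hypothetical reconstruction of the idea; the weight-map convention is assumed):

```python
import numpy as np

def preprocess_frame(cur, prev, weight):
    """ROI-aware temporal filter: weight is 1 inside the ROI, 0 in the
    static background, and ramps linearly across the transition zone,
    so background pixels reuse the previous frame while ROI pixels
    keep their current values."""
    w = np.clip(weight, 0.0, 1.0)
    return w * cur + (1.0 - w) * prev

cur = np.full((2, 2), 2.0)
prev = np.zeros((2, 2))
half = preprocess_frame(cur, prev, np.full((2, 2), 0.5))  # mid-transition
```

    The linear ramp plays the role of the interpolation at the background-to-ROI transition: it avoids a hard seam where moving ROI borders would otherwise expose stale background data.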
  • 48.
    Karlsson, Linda
    et al.
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Technology and Media.
    Sjöström, Mårten
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Technology and Media.
    A preprocessing approach to ROI Video Coding using Variable Gaussian Filters and Variance in Intensity2005In: Proceedings Elmar - International Symposium Electronics in Marine, Zagreb, Croatia: IEEE conference proceedings, 2005, p. 65-68, article id 1505643Conference paper (Refereed)
    Abstract [en]

    In applications involving video over mobile phones or the Internet, the limited quality imposed by the transmission rate can be further improved by region-of-interest (ROI) coding. In this paper we present a preprocessing method using variable Gaussian filters controlled by a quality map, indicating the distance to the ROI border, that seeks to smooth the border effects between ROI and non-ROI. According to subjective tests, the reduction of border effects increases the perceived quality, compared to using only one low-pass filter. It also introduces a small improvement of the PSNR of the intensity component within the ROI after compression. With the compressed original sequence as a reference, the average PSNR was increased by 1.25 dB and 2.3 dB for 100 kbit/s and 150 kbit/s, respectively. Furthermore, in order to reduce computational complexity, a modified quality map is introduced using variance in intensity to exclude pixels which are not visibly affected by the Gaussian filters. No change in quality is noticed when using less than 76% of the pixels.

  • 49.
    Karlsson, Linda
    et al.
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Technology and Media.
    Sjöström, Mårten
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Technology and Media.
    Improved ROI Video Coding using Variable Gaussian Pre-Filters and Variance in Intensity2005In: IEEE International Conference on Image Processing 2005, ICIP 2005: Vol. 2, 2005, p. 1817-1820, article id 1530054Conference paper (Refereed)
    Abstract [en]

    In applications involving video over mobile phones or Internet, the limited quality depending on the transmission rate can be further improved by region-of-interest (ROI) coding. In this paper we present a preprocessing method using variable Gaussian filters controlled by a quality map indicating the distance to the ROI border. The border effects are reduced introducing a small improvement of the PSNR of the intensity component within the ROI after compression, compared to using only one low pass filter. With the compressed original sequence as a reference, the average PSNR was increased by 1.25 dB and 2.3 dB for 100 kbit/s and 150 kbit/s, respectively. A modified quality map is introduced using variance to exclude pixels, which are not visibly affected by the Gaussian filters, reducing computational complexity. Using less than 76% of the pixels gives no noticeable change in quality.

  • 50.
    Karlsson, Linda
    et al.
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Technology and Media.
    Sjöström, Mårten
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Technology and Media.
    Multiview plus depth scalable coding in the depth domain2009In: 3DTV-CON 2009 - 3rd 3DTV-Conference: The True Vision - Capture, Transmission and Display of 3D Video, Proceedings, IEEE conference proceedings, 2009, p. 5069631-Conference paper (Refereed)
    Abstract [en]

    Three-dimensional (3D) TV is a growing area that provides an extra dimension at the cost of spatial resolution. The multi-view plus depth representation provides a lower bit rate when encoded than multi-view, and higher resolution than a 2D-plus-depth sequence. Scalable video coding provides adaptation to the conditions at the receiver. In this paper we propose a scheme that combines scalability in both the view and depth domains. The center view data is preserved, whereas the data of the side views are extracted in layers depending on the distance to the camera. This allows a decrease in bit rate of 16-39% for the colour part of a 3-view MV, depending on the number of pixels in the first enhancement layer, if one layer is extracted. Each additional layer increases the visual quality and PSNR compared to only using center view data.

    Download full text (pdf)
    FULLTEXT01
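    The layered extraction of side-view data by distance to the camera can be sketched as simple depth-threshold binning (a hypothetical illustration of the layering idea; thresholds and depth values are made up):

```python
import numpy as np

def depth_layers(depth, thresholds):
    """Assign each side-view pixel to an enhancement layer by its
    depth: layer 0 holds the pixels nearest the camera, and deeper
    pixels fall into later layers."""
    return np.digitize(depth, thresholds)

d = np.array([0.5, 1.5, 3.0, 9.0])               # per-pixel depths
layers = depth_layers(d, thresholds=[1.0, 2.0, 4.0])
```

    A receiver that only decodes the first layers then reconstructs the foreground of the side views at full quality and falls back to center-view data for the rest.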