Mid Sweden University

miun.se Publications
1 - 50 of 68
  • 1.
    Ak, Ali
    et al.
    Nantes Université, École Centrale Nantes, CNRS, LS2N, UMR 6004, Nantes, France.
    Zerman, Emin
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Computer and Electrical Engineering (2023-).
    Quach, Maurice
    Université Paris-Saclay, CNRS, CentraleSupélec, Laboratoire des Signaux et Systèmes (UMR 8506), Gif-sur-Yvette, France.
    Chetouani, Aladine
    Laboratoire PRISME, Université d'Orléans, Orléans, France.
    Smolic, Aljosa
    Lucerne University of Applied Sciences and Arts (HSLU), Rotkreuz, Switzerland.
    Valenzise, Giuseppe
    Université Paris-Saclay, CNRS, CentraleSupélec, Laboratoire des Signaux et Systèmes (UMR 8506), Gif-sur-Yvette, France.
    Le Callet, Patrick
    Nantes Université, École Centrale Nantes, CNRS, LS2N, UMR 6004, Nantes, France.
    BASICS: Broad Quality Assessment of Static Point Clouds in a Compression Scenario (2024). In: IEEE Transactions on Multimedia, ISSN 1520-9210, E-ISSN 1941-0077. Article in journal (Refereed)
    Abstract [en]

    Point clouds have become increasingly prevalent in representing 3D scenes within virtual environments, alongside 3D meshes. Their ease of capture has facilitated a wide array of applications on mobile devices, from smartphones to autonomous vehicles. Notably, point cloud compression has reached an advanced stage and has been standardized. However, the availability of quality assessment datasets, which are essential for developing improved objective quality metrics, remains limited. In this paper, we introduce BASICS, a large-scale quality assessment dataset tailored for static point clouds. The BASICS dataset comprises 75 unique point clouds, each compressed with four different algorithms including a learning-based method, resulting in the evaluation of nearly 1500 point clouds by 3500 unique participants. Furthermore, we conduct a comprehensive analysis of the gathered data, benchmark existing point cloud quality assessment metrics and identify their limitations. By publicly releasing the BASICS dataset, we lay the foundation for addressing these limitations and fostering the development of more precise quality metrics.

  • 2.
    Andersson, Håkan
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Technology and Media.
    3D Video Playback: A modular cross-platform GPU-based approach for flexible multi-view 3D video rendering (2010). Independent thesis, Basic level (professional degree), 10 credits / 15 HE credits. Student thesis
    Abstract [en]

    The evolution of depth-perception visualization technologies, emerging format standardization work and research within the field of multi-view 3D video and imagery addresses the need for flexible 3D video visualization. The wide variety of available 3D-display types and visualization techniques for multi-view video, as well as the high throughput requirements for high definition video, addresses the need for a real-time 3D video playback solution that takes advantage of hardware-accelerated graphics, while providing a high degree of flexibility through format configuration and cross-platform interoperability. A modular, component-based software solution is proposed, based on FFmpeg for video demultiplexing and decoding, using OpenGL and GLUT for hardware-accelerated graphics and POSIX threads for increased CPU utilization. The solution has been verified to have sufficient throughput to display 1080p video at the native video frame rate on the experimental system, a standard high-end desktop PC using only commercial off-the-shelf hardware. In order to evaluate the performance of the proposed solution, a number of throughput evaluation metrics have been introduced, measuring average frame rate as a function of video bit rate, video resolution and number of views. The results obtained indicate that the GPU constitutes the primary bottleneck in a multi-view lenticular rendering system and that multi-view rendering performance degrades as the number of views increases. This is a result of current GPU square-matrix texture cache architectures, which yield texture lookup access times corresponding to random memory access patterns when the number of views is high. The proposed solution has been found to have low CPU efficiency, i.e. low CPU hardware utilization, and it is recommended to investigate the gains of scalable multithreading techniques to increase performance. It is also recommended to investigate the gains of introducing video frame buffering in video memory, or of moving more calculations to the CPU, in order to increase GPU performance.

    Download full text (pdf)
    FULLTEXT01
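The throughput metrics described in the abstract reduce to measuring average frame rate over a render run. A minimal sketch of that metric; the function name and the per-frame timings are illustrative, not taken from the thesis:

```python
def average_frame_rate(frame_times_s):
    """Average frames per second over a sequence of per-frame render times."""
    if not frame_times_s:
        raise ValueError("no frames measured")
    return len(frame_times_s) / sum(frame_times_s)

# Hypothetical per-frame render times (seconds) for one multi-view pass:
fps = average_frame_rate([0.025, 0.024, 0.026, 0.025])
```

In the thesis this average is plotted against bit rate, resolution, and number of views to locate the GPU bottleneck.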
  • 3.
    Brunnstrom, Kjell
    et al.
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Systems and Technology. RISE AB Acreo.
    Barkowsky, Marcus
    Deggendorf Institute of Technology (DIT), University of Applied Sciences, Deggendorf.
    Statistical quality of experience analysis: on planning the sample size and statistical significance testing (2018). In: Journal of Electronic Imaging (JEI), ISSN 1017-9909, E-ISSN 1560-229X, Vol. 27, no 5, p. 053013-1-053013-11, article id 053013. Article in journal (Refereed)
    Abstract [en]

    This paper analyzes how an experimenter can balance errors in subjective video quality tests between the statistical power of finding an effect if it is there and not claiming that an effect is there if it is not, i.e., balancing Type I and Type II errors. The risk of committing Type I errors increases with the number of comparisons that are performed in statistical tests. We show that when controlling for this, while at the same time keeping the power of the experiment at a reasonably high level, it is unlikely that the number of test subjects normally used and recommended by the International Telecommunication Union (ITU), i.e., 15, is sufficient, whereas the number used by the Video Quality Experts Group (VQEG), i.e., 24, is more likely to be sufficient. Examples are also given for the influence of Type I errors on the statistical significance of comparing objective metrics by correlation. We also present a comparison between parametric and nonparametric statistics. The comparison targets the question of whether we would reach different conclusions on the statistical difference between the video quality ratings of different video clips in a subjective test, based on the comparison between the Student's t-test and the Mann-Whitney U-test. We found that there was hardly any difference when few comparisons were compensated for, i.e., almost the same conclusions were reached. When the number of comparisons is increased, larger and larger differences between the two methods are revealed. In these cases, the parametric t-test gives clearly more significant cases than the nonparametric test, which makes it more important to investigate whether the assumptions are met for performing a certain test.

    Download full text (pdf)
    fulltext
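The trade-off the paper analyses — Type I error inflation under multiple comparisons versus statistical power — can be illustrated with a standard two-sample normal-approximation sample-size formula using a Bonferroni-adjusted significance level. This is a generic textbook sketch, not the paper's exact computation, and the effect size is an assumed value:

```python
from math import ceil
from statistics import NormalDist

def n_per_group(effect_size, alpha=0.05, power=0.8, comparisons=1):
    """Sample size per group for a two-sample comparison (normal
    approximation), with Bonferroni correction for multiple comparisons."""
    adj_alpha = alpha / comparisons          # Bonferroni-adjusted level
    z_a = NormalDist().inv_cdf(1 - adj_alpha / 2)
    z_b = NormalDist().inv_cdf(power)
    return ceil(2 * ((z_a + z_b) / effect_size) ** 2)

# Assumed effect size d = 0.7: the required n grows as more
# comparisons must be corrected for.
n_single = n_per_group(0.7)                  # one comparison
n_multi = n_per_group(0.7, comparisons=10)   # ten comparisons
```

The paper's own analysis concerns rating-scale subjective tests; the sketch only conveys why a panel of 15 subjects can fall short once comparisons multiply.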
  • 4. Carle, Fredrik
    et al.
    Koptioug, Andrei
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Engineering and Sustainable Development.
    Portable rescue device and a method for locating such a device (2007). Patent (Other (popular science, discussion, etc.))
    Abstract [en]

    A portable rescue device and a method for locating, by means of a first rescue device set in a search mode, a second rescue device set in a distress mode. In the method, a distress signal carrying a device identification is received from said second rescue device. A first bearing and a second bearing to the second rescue device are obtained. The first and second bearings are taken from a first and a second position, respectively. A distance between these positions is determined. A current distance and a current bearing to the second rescue device are determined on basis of the first and second bearings and the distance. The current bearing and the current distance are communicated to a user of the first rescue device. The portable rescue device is used for performing the method and for that purpose it includes a first communication unit for distress signal transmission and reception; a compass; a processor; a user interface; and a mode switch for switching between a search mode and a distress signal mode. The first communication device has an antenna structure that provides directional capability.

  • 5.
    Colombo, Roberto M.
    et al.
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Systems and Technology. University of Brescia.
    Mahmood, Aamir
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Systems and Technology.
    Sisinni, Emiliano
    Ferrari, Paolo
    Gidlund, Mikael
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Systems and Technology.
    Low-cost SDR-based Tool for Evaluating LoRa Satellite Communications (2022). In: 2022 IEEE International Symposium on Measurements and Networking, M and N 2022 - Proceedings, IEEE conference proceedings, 2022. Conference paper (Refereed)
    Abstract [en]

    LoRa (Long Range) technology, with great success in providing coverage for massive Internet-of-Things (IoT) deployments, is now being considered to complement terrestrial networks with Low Earth Orbit (LEO) satellite connectivity. The objective is to extend coverage to remote areas for various verticals, such as logistics, asset tracking, transportation, utilities, agriculture, and maritime. However, only limited studies have realistically evaluated the effects of ground-to-satellite links, owing to the high cost of the traditional tools and methods used to emulate the radio channel. In this paper, as an alternative to an expensive channel emulator, we propose and develop a method for the experimental study of LoRa satellite links using a lower-cost software-defined radio (SDR). Since the working details of LoRa modulation are limited to reverse-engineered imitations, we employ such a version on an SDR platform and add easily controllable adverse channel effects to evaluate LoRa for satellite connectivity. In our work, emulation of the Doppler effect is considered a key aspect for testing the reliability of LoRa satellite links. Therefore, after demonstrating the correctness of the (ideal) LoRa transceiver implementation, achieving a low packet error ratio (PER) with a commercial LoRa receiver, the baseband signal is distorted to emulate the Doppler effect, mimicking real LoRa satellite communication. The Doppler effect is related to time-on-air (ToA), which is bound to the communication parameters and orbit height. Higher ToAs and lower orbits decrease the link duration, mainly because of the dynamic Doppler effect.
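For intuition on the Doppler emulation, the worst-case shift on a LoRa carrier can be bounded from the circular-orbit velocity. The constants and the 550 km orbit below are illustrative assumptions, not values from the paper:

```python
import math

C = 299_792_458.0        # speed of light, m/s
MU = 3.986004418e14      # Earth's standard gravitational parameter, m^3/s^2
R_EARTH = 6_371_000.0    # mean Earth radius, m

def max_doppler_hz(carrier_hz, orbit_height_m):
    """Upper bound on Doppler shift for a circular LEO orbit, taking the
    full orbital velocity as the line-of-sight component (low elevation)."""
    v_orb = math.sqrt(MU / (R_EARTH + orbit_height_m))
    return carrier_hz * v_orb / C

# EU868 LoRa carrier seen from an assumed 550 km orbit: roughly +/- 22 kHz.
shift = max_doppler_hz(868e6, 550e3)
```

Lower orbits and longer time-on-air worsen the picture: the shift, and its rate of change near zenith, then moves further within a single packet.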

  • 6.
    Comstedt, Erik
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Systems and Technology.
    Effect of additional compression features on h.264 surveillance video (2017). Independent thesis, Basic level (degree of Bachelor), 10 credits / 15 HE credits. Student thesis
    Abstract [en]

    In the video surveillance business, a recurring topic of discussion is quality versus data usage. Higher quality allows more details to be captured at the cost of a higher bit rate, and for cameras monitoring events 24 hours a day, limiting data usage can quickly become a factor to consider. The purpose of this thesis has been to apply additional compression features to an h.264 video stream and evaluate their effects on the video's overall quality. Using a surveillance camera, recordings of video streams were obtained. These recordings had constant GOP and frame rates. By breaking down one of these videos into an image sequence, it was possible to encode the image sequence into video streams with variable GOP/FPS using the software FFmpeg. Additionally, a user test was performed on these video streams, following the DSCQS standard from the ITU-R recommendation. The participants had to subjectively determine the quality of the video streams. The results from these tests showed that the participants did not notice any considerable difference in quality between the normal videos and the videos with variable GOP/FPS. Based on these results, the thesis has shown that additional compression features can be applied to h.264 surveillance streams without a substantial effect on the video streams' overall quality.

    Download full text (pdf)
    fulltext
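The encoding step described above — turning an image sequence into an H.264 stream with a chosen GOP length and frame rate via FFmpeg — corresponds to a command of the following shape. The flag values here are illustrative; the thesis's exact settings are not reproduced:

```python
def ffmpeg_h264_args(image_pattern, fps, gop, out_file):
    """Build an ffmpeg argument list that encodes an image sequence to
    H.264 with an explicit GOP length (-g) and frame rate."""
    return [
        "ffmpeg",
        "-framerate", str(fps),   # input rate of the image sequence
        "-i", image_pattern,      # e.g. frame_%04d.png
        "-c:v", "libx264",        # H.264 encoder
        "-g", str(gop),           # GOP length: frames between I-frames
        "-r", str(fps),           # output frame rate
        out_file,
    ]

cmd = ffmpeg_h264_args("frame_%04d.png", 25, 50, "out.mp4")
```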
  • 7.
    Damghanian, Mitra
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information and Communication systems.
    The Sampling Pattern Cube: A Framework for Representation and Evaluation of Plenoptic Capturing Systems (2013). Licentiate thesis, monograph (Other academic)
    Abstract [en]

    Digital cameras have already entered our everyday life. Rapid technological advances have made it easier and cheaper to develop new cameras with unconventional structures. The plenoptic camera is one of these new devices; it captures light information that can then be processed for applications such as focus adjustment. High-level camera properties, such as spatial or angular resolution, are required to evaluate and compare plenoptic cameras. With complex camera structures that introduce trade-offs between various high-level camera properties, it is no longer straightforward to describe and extract these properties. Proper models, methods and metrics with the desired level of detail are beneficial for describing and evaluating plenoptic camera properties.

    This thesis attempts to describe and evaluate camera properties using a model based representation of plenoptic capturing systems in favour of a unified language. The SPC model is proposed and it describes which light samples from the scene are captured by the camera system. Light samples in the SPC model carry the ray and focus information of the capturing setup. To demonstrate the capabilities of the introduced model, property extractors for lateral resolution are defined and evaluated. The lateral resolution values obtained from the introduced model are compared with the results from the ray-based model and the ground truth data. The knowledge about how to generate and visualize the proposed model and how to extract the camera properties from the model based representation of the capturing system is collated to form the SPC framework.

    The main outcomes of the thesis can be summarized in the following points: A model based representation of the light sampling behaviour of the plenoptic capturing system is introduced, which incorporates the focus information as well as the ray information. A framework is developed to generate the SPC model and to extract high level properties of the plenoptic capturing system. Results confirm that the SPC model is capable of describing the light sampling behaviour of the capturing system, and that the SPC framework is capable of extracting high level camera properties at a higher descriptive level than the ray-based model. The results from the proposed model compete with those from the more elaborate wave optics model in the ranges where the wave nature of light is not dominant. The outcome of the thesis can benefit the design, evaluation and comparison of complex capturing systems.

    Download full text (pdf)
    MitraDamghanianLicThesis
  • 8.
    Damghanian, Mitra
    et al.
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Technology and Media.
    Olsson, Roger
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Technology and Media.
    Sjöström, Mårten
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Technology and Media.
    The Sampling Pattern Cube: A Representation and Evaluation Tool for Optical Capturing Systems (2012). In: Advanced Concepts for Intelligent Vision Systems / [ed] Blanc-Talon, Jacques; Philips, Wilfried; Popescu, Dan; Scheunders, Paul; Zemcík, Pavel. Berlin / Heidelberg: Springer Berlin/Heidelberg, 2012, p. 120-131. Conference paper (Refereed)
    Abstract [en]

    Knowledge about how the light field is sampled through a camera system gives the required information to investigate interesting camera parameters. We introduce a simple and handy model to look into the sampling behavior of a camera system. We have applied this model to a single-lens system as well as to plenoptic cameras. We have investigated how camera parameters of interest are interpreted in our proposed model-based representation. This model also enables us to compare capturing systems or to investigate how variations in an optical capturing system affect its sampling behavior.

  • 9.
    Damghanian, Mitra
    et al.
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information and Communication systems.
    Olsson, Roger
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information and Communication systems.
    Sjöström, Mårten
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information and Communication systems.
    Erdmann, Arne
    Raytrix Gmbh.
    Perwass, Christian
    Raytrix Gmbh.
    Spatial resolution in a multi-focus plenoptic camera (2014). In: IEEE International Conference on Image Processing, ICIP 2014, IEEE conference proceedings, 2014, p. 1932-1936, article id 7025387. Conference paper (Refereed)
    Abstract [en]

    Evaluation of state-of-the-art plenoptic cameras is necessary for design and application purposes. In this work, spatial resolution is investigated in a multi-focus plenoptic camera using two approaches: empirical and model-based. The Raytrix R29 plenoptic camera is studied, which utilizes three types of micro lenses with different focal lengths in a hexagonal array structure to increase the depth of field. The model-based approach utilizes the previously proposed sampling pattern cube (SPC) model for representation and evaluation of plenoptic capturing systems. For the experimental resolution measurements, spatial resolution values are extracted from images reconstructed by the provided Raytrix reconstruction method. Both the measurement and the SPC model based approaches demonstrate a gradual variation of the resolution values over a wide depth range for the multi-focus R29 camera. Moreover, the good agreement between the results from the model-based approach and those from the empirical approach confirms the suitability of the SPC model for evaluating high-level camera parameters, such as spatial resolution, in a complex capturing system such as the R29 multi-focus plenoptic camera.

    Download full text (pdf)
    Damghanian_Spatial_resolution
  • 10.
    Damghanian, Mitra
    et al.
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information and Communication systems.
    Olsson, Roger
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information and Communication systems.
    Sjöström, Mårten
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information and Communication systems.
    Navarro Fructuoso, Hector
    Department of Optics, University of Valencia, Spain.
    Martinez Corral, Manuel
    Department of Optics, University of Valencia, Spain.
    Investigating the lateral resolution in a plenoptic capturing system using the SPC model (2013). In: Proceedings of SPIE - The International Society for Optical Engineering: Digital Photography IX, SPIE - International Society for Optical Engineering, 2013, p. 86600T. Conference paper (Refereed)
    Abstract [en]

    Complex multidimensional capturing setups such as plenoptic cameras (PCs) introduce a trade-off between various system properties. Consequently, established capturing properties, like image resolution, need to be described thoroughly for these systems. Therefore, models and metrics that assist in exploring and formulating this trade-off are highly beneficial for studying as well as designing complex capturing systems. This work demonstrates the capability of our previously proposed sampling pattern cube (SPC) model to extract the lateral resolution of plenoptic capturing systems. The SPC carries both ray information and focal properties of the capturing system it models. The proposed operator extracts the lateral resolution from the SPC model throughout an arbitrary number of depth planes, giving a depth-resolution profile. This operator utilizes focal properties of the capturing system as well as the geometrical distribution of the light containers, which are the elements of the SPC model. We have validated the lateral resolution operator for different capturing setups by comparing the results with those from Monte Carlo numerical simulations based on the wave optics model. The lateral resolution predicted by the SPC model agrees with the results from the more complex wave optics model better than both the ray-based model and our previously proposed lateral resolution operator. This agreement strengthens the conclusion that the SPC fills the gap between ray-based models and the real system performance by including the focal information of the system as a model parameter. The SPC is proven to be a simple yet efficient model for extracting the lateral resolution as a high-level property of complex plenoptic capturing systems.

    Download full text (pdf)
    fulltext
  • 11.
    Dima, Elijs
    et al.
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information and Communication systems.
    Sjöström, Mårten
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information and Communication systems.
    Olsson, Roger
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information and Communication systems.
    Assessment of Multi-Camera Calibration Algorithms for Two-Dimensional Camera Arrays Relative to Ground Truth Position and Direction (2016). In: 3DTV-Conference, IEEE Computer Society, 2016, article id 7548887. Conference paper (Refereed)
    Abstract [en]

    Camera calibration methods are commonly evaluated on cumulative reprojection error metrics, on disparate one-dimensional datasets. To evaluate calibration of cameras in two-dimensional arrays, assessments need to be made on two-dimensional datasets with constraints on camera parameters. In this study, the accuracy of several multi-camera calibration methods has been evaluated on the camera parameters that affect view projection the most. As input data, we used a 15-viewpoint two-dimensional dataset with intrinsic and extrinsic parameter constraints and extrinsic ground truth. The assessment showed that self-calibration methods using structure-from-motion reach intrinsic and extrinsic parameter estimation accuracy equal to that of the standard checkerboard calibration algorithm, and surpass a well-known self-calibration toolbox, BlueCCal. These results show that self-calibration is a viable approach to calibrating two-dimensional camera arrays, but improvements to state-of-the-art multi-camera feature matching are necessary to make BlueCCal as accurate as other self-calibration methods for two-dimensional camera arrays.

    Download full text (pdf)
    AssessmentOfMultiCameraCalibrationAlgorithms
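The cumulative metric the paper argues is insufficient on its own — reprojection error — is simply the RMS pixel distance between reprojected and detected image points. A minimal sketch; the point coordinates are made up:

```python
import math

def rms_reprojection_error(projected, detected):
    """Root-mean-square distance (pixels) between reprojected and
    detected 2-D image points."""
    assert len(projected) == len(detected)
    sq = [(px - dx) ** 2 + (py - dy) ** 2
          for (px, py), (dx, dy) in zip(projected, detected)]
    return math.sqrt(sum(sq) / len(sq))

# Two hypothetical correspondences, each off by 1-2 pixels:
err = rms_reprojection_error([(10.0, 10.0), (20.0, 22.0)],
                             [(10.0, 11.0), (20.0, 20.0)])
```

The study's point is that a low value of this aggregate can hide errors in the individual intrinsic and extrinsic parameters, hence the per-parameter assessment against ground truth.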
  • 12.
    Dima, Elijs
    et al.
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Systems and Technology.
    Sjöström, Mårten
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Systems and Technology.
    Olsson, Roger
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Systems and Technology.
    Modeling Depth Uncertainty of Desynchronized Multi-Camera Systems (2017). In: 2017 International Conference on 3D Immersion (IC3D), IEEE, 2017. Conference paper (Refereed)
    Abstract [en]

    Accurately recording motion from multiple perspectives is relevant for recording and processing immersive multi-media and virtual reality content. However, synchronization errors between multiple cameras limit the precision of scene depth reconstruction and rendering. In order to quantify this limit, a relation between camera de-synchronization, camera parameters, and scene element motion has to be identified. In this paper, a parametric ray model describing depth uncertainty is derived and adapted for the pinhole camera model. A two-camera scenario is simulated to investigate the model behavior and how camera synchronization delay, scene element speed, and camera positions affect the system's depth uncertainty. Results reveal a linear relation between synchronization error, element speed, and depth uncertainty. View convergence is shown to affect mean depth uncertainty up to a factor of 10. Results also show that depth uncertainty must be assessed on the full set of camera rays instead of a central subset.

    Download full text (pdf)
    fulltext
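The linear relation reported in the abstract can be read as a first-order bound: a desynchronization of dt seconds lets an element moving at v metres per second displace by v*dt between exposures, and triangulated depth inherits an uncertainty proportional to that displacement. The scalar geometry factor below is a simplification standing in for the paper's full ray model:

```python
def depth_uncertainty_bound(element_speed_mps, sync_delay_s, geometry_scale=1.0):
    """First-order depth-uncertainty bound: displacement v*dt scaled by a
    view-geometry factor (the paper finds view convergence alone can change
    mean uncertainty by up to a factor of 10)."""
    return geometry_scale * element_speed_mps * sync_delay_s

# 5 m/s element motion with a 10 ms synchronization error (hypothetical):
u = depth_uncertainty_bound(5.0, 0.010)   # 0.05 m displacement bound
```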
  • 13.
    Du, Shiyu Sandy
    et al.
    Beihang University, Beijing, China.
    Wong, Kainam Thomas
    Beihang University, Beijing, China.
    Song, Yang
    Nanyang Technological University, Singapore.
    Nnonyelu, Chibuzo Joseph
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Electronics Design.
    Wu, Yue Ivan
    Sichuan University, Chengdu, Sichuan, China.
    Higher-order figure-8 microphones/hydrophones collocated as a perpendicular triad—Their “spatial-matched-filter” beam steering (2022). In: Journal of the Acoustical Society of America, ISSN 0001-4966, E-ISSN 1520-8524, Vol. 151, no 2, p. 1158-1170. Article in journal (Refereed)
    Abstract [en]

    Directional sensors, if collocated but perpendicularly oriented among themselves, would facilitate signal processing to uncouple the azimuth-polar direction from the time-frequency dimension—in addition to the physical advantage of spatial compactness. One such acoustical sensing unit is the well-known “tri-axial velocity sensor” (also known as the “gradient sensor,” the “velocity-sensor triad,” the “acoustic vector sensor,” and the “vector hydrophone”), which comprises three identical figure-8 sensors of the first directivity-order, collocated spatially but oriented perpendicularly of each other. The directivity of the figure-8 sensors is hypothetically raised to a higher order in this analytical investigation with an innocent hope to sharpen the overall triad's directionality and steerability. Against this wishful aspiration, this paper rigorously analyzes how the directivity-order would affect the triad's “spatial-matched-filter” beam's directional steering capability, revealing which directivity-order(s) would allow the beam-pattern of full maneuverability toward any azimuthal direction and which directivity-order(s) cannot.

    Download full text (pdf)
    fulltext
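The "spatial matched filter" beam of the abstract can be sketched for an order-n triad: each sensor responds as the n-th power of the direction cosine along its axis, and the beamformer weights are the look direction's own responses. This toy model is for intuition only; the paper's per-order steerability analysis is the rigorous version:

```python
import math

def triad_response(order, az, pol):
    """Responses of three collocated figure-8 sensors of a given
    directivity order, oriented along x, y, z."""
    u = (math.sin(pol) * math.cos(az),   # direction cosines of (az, pol)
         math.sin(pol) * math.sin(az),
         math.cos(pol))
    return [c ** order for c in u]

def matched_filter_gain(order, look, source):
    """Weight by the look direction's responses, sum over the triad,
    and normalize by the look-direction weight norm."""
    w = triad_response(order, *look)
    r = triad_response(order, *source)
    norm = math.sqrt(sum(c * c for c in w))
    return sum(wi * ri for wi, ri in zip(w, r)) / norm

# First order: steering exactly at the source gives unit gain.
g = matched_filter_gain(1, (0.3, 1.0), (0.3, 1.0))
```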
  • 14.
    Edlund, Joakim
    et al.
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Systems and Technology.
    Guillemot, Christine
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Systems and Technology. Institut National de Recherche en Informatique et en Automatique, Rennes, France.
    Sjöström, Mårten
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Systems and Technology.
    Analysis of Top-Down Connections in Multi-Layered Convolutional Sparse Coding (2021). In: 2021 IEEE 23rd International Workshop on Multimedia Signal Processing (MMSP), IEEE, 2021. Conference paper (Refereed)
    Abstract [en]

    Convolutional Neural Networks (CNNs) have been instrumental in the recent advances in machine learning, with numerous media applications. Multi-Layered Convolutional Sparse Coding (ML-CSC), based on a cascade of convolutional layers in which each layer can be approximately explained by the following layer, can be seen as a biologically inspired framework. However, both CNNs and ML-CSC networks lack the top-down information flows that are studied in neuroscience for understanding the mechanisms of the mammalian cortex. A successful implementation of such top-down connections could lead to another leap in machine learning and media applications. This study analyses the effects of a feedback connection on an ML-CSC network, considering the trade-off between sparsity and reconstruction error, the support recovery rate, and mutual coherence in trained dictionaries. We find that using the feedback connection during training impacts the mutual coherence of the dictionary such that the equivalence between the $l_0$- and $l_1$-norms is verified for a smaller range of sparsity values. Experimental results show that the use of feedback during training does not favour inference with feedback, in terms of sparse support recovery rates. However, when the sparsity constraints are given a lower weight, the use of feedback at inference time is beneficial in terms of support recovery rates.
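The sparsity/reconstruction trade-off measured in the study is governed, in sparse coding, by the l1 proximal (soft-threshold) step. A dense toy stand-in for one layer's convolutional dictionary; the matrix and values are illustrative, not the paper's setup:

```python
import math

def soft_threshold(x, t):
    """Proximal operator of the l1-norm: shrink each coefficient
    towards zero by t (the sparsity/error trade-off knob)."""
    return [math.copysign(max(abs(v) - t, 0.0), v) for v in x]

def ista_step(D, y, x, step, lam):
    """One ISTA iteration for min ||Dx - y||^2 + lam * ||x||_1,
    with D a small dense stand-in for a convolutional dictionary."""
    m, n = len(y), len(x)
    residual = [sum(D[i][j] * x[j] for j in range(n)) - y[i] for i in range(m)]
    grad = [sum(D[i][j] * residual[i] for i in range(m)) for j in range(n)]
    return soft_threshold([xi - step * gi for xi, gi in zip(x, grad)],
                          step * lam)

# Identity dictionary: one step keeps the large coefficient and
# suppresses the small one below the threshold.
x1 = ista_step([[1.0, 0.0], [0.0, 1.0]], [1.0, 0.1], [0.0, 0.0], 1.0, 0.5)
```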

  • 15. Ferrari, P.
    et al.
    Bellagente, P.
    Depari, A.
    Flammini, A.
    Pasetti, M.
    Rinaldi, S.
    Sisinni, E.
    Resilient time synchronization opportunistically exploiting UWB RTLS infrastructure (2022). In: IEEE Transactions on Instrumentation and Measurement, ISSN 0018-9456, E-ISSN 1557-9662, Vol. 71, p. 1-10. Article in journal (Refereed)
    Abstract [en]

    Ultra-wideband (UWB) based solutions for real-time localization are becoming widely diffused. They use a two-way ranging scheme, achieving indoor positioning accuracy well below ten centimeters. These wireless devices are based on counters with picosecond resolution, which could also be used for node time synchronization. This work proposes an opportunistic approach for transparently obtaining multiple accurate time synchronization references from the low-cost infrastructure of Real-Time Location Systems (RTLS). After the description of the proposed approach, the idea is demonstrated using off-the-shelf UWB modules from Decawave and their related software. Thanks to hardware timestamping support inside the core architecture, the realized wireless station is able to simultaneously lock onto and track several time references generated by the UWB module. The extensive experimental characterization evaluates both the uncertainty of the reference signal generated by the UWB receiver and the time synchronization uncertainty of the whole host system running a Proportional-Integral (PI) control loop for locking to the master reference clock. The time reference pulses are delivered by the UWB modules with a maximum jitter on the order of 40 ns, whereas the synchronization uncertainty is less than 10 ns.
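The PI control loop used to lock the host clock to the reference pulses can be sketched as a discrete servo. The gains, the one-second measurement interval, and the 100 ppb frequency error are invented for illustration, not taken from the paper:

```python
def simulate_pi_sync(freq_error_ppb, steps, kp=0.7, ki=0.3):
    """Discrete PI servo disciplining a local clock to once-per-second
    reference ticks: each measured offset (ns) updates a frequency
    correction (ppb); over a 1 s interval, 1 ppb of residual frequency
    error accumulates 1 ns of offset."""
    offset_ns, integral, correction_ppb = 0.0, 0.0, 0.0
    history = []
    for _ in range(steps):
        offset_ns += freq_error_ppb - correction_ppb  # drift over 1 s
        integral += offset_ns
        correction_ppb = kp * offset_ns + ki * integral
        history.append(offset_ns)
    return history

# A 100 ppb oscillator error is servoed out within a few tens of seconds:
h = simulate_pi_sync(100.0, 50)
```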

  • 16.
    Gao, Shan
    et al.
    School of Science, Beijing Jiaotong University, Beijing 100044, China.
    Qu, Gangrong
    School of Science, Beijing Jiaotong University, Beijing 100044, China.
    Sjöström, Mårten
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Systems and Technology.
    Liu, Yuhan
    School of Science, Beijing Jiaotong University, Beijing 100044, China.
    A TV regularisation sparse light field reconstruction model based on guided-filtering. 2022. In: Signal processing. Image communication, ISSN 0923-5965, E-ISSN 1879-2677, Vol. 109, article id 116852. Article in journal (Refereed)
    Abstract [en]

    Obtaining and representing the 4D light field is important for a number of computer vision applications. Due to its high dimensionality, acquiring the light field directly is costly. One way to overcome this deficiency is to reconstruct the light field from a limited number of measurements. Existing approaches either involve a depth estimation process or require a large number of measurements to obtain high-quality reconstructed results. In this paper, we propose a total variation (TV) regularisation sparse model with the alternating direction method of multipliers (ADMM) based on guided filtering, which addresses this depth-dependence problem with only a few measurements. As one of the sparse optimisation methods, TV regularisation based on ADMM is well suited to solving ill-posed problems such as this. Moreover, guided filtering has good edge-preserving smoothing properties, which can be incorporated into the light field reconstruction process. Therefore, high-precision light field reconstruction is established with our model. Specifically, the updated image in the iteration step contains the guidance image, and an initialiser for the least-squares QR (LSQR) algorithm is involved in one of the subproblems. The model outperforms other methods in both visual assessments and objective metrics, in simulation experiments on synthetic and photographic data using focal stacks produced from light field contents, and it works well in experiments using captured focal stacks. We also show a further application to arbitrary refocusing by using the reconstructed light field.
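The TV-regularised recovery with ADMM that the model above builds on can be sketched in one dimension as follows. This is a hedged toy example of the generic TV-ADMM building block, not the paper's guided-filtering light field model; the piece-wise constant signal, `lam`, and `rho` are illustrative assumptions.

```python
import numpy as np

def tv_admm(y, lam=0.5, rho=1.0, n_iter=200):
    """1-D total-variation denoising, min 0.5*||x-y||^2 + lam*||Dx||_1, via ADMM."""
    n = len(y)
    D = np.diff(np.eye(n), axis=0)          # forward-difference operator, (n-1) x n
    A = np.eye(n) + rho * D.T @ D           # system matrix of the x-update
    x, z, u = y.copy(), np.zeros(n - 1), np.zeros(n - 1)
    for _ in range(n_iter):
        x = np.linalg.solve(A, y + rho * D.T @ (z - u))       # quadratic subproblem
        w = D @ x + u
        z = np.sign(w) * np.maximum(np.abs(w) - lam / rho, 0)  # shrinkage on Dx
        u += D @ x - z                                         # dual update
    return x

rng = np.random.default_rng(1)
clean = np.repeat([0.0, 2.0, -1.0], 40)     # piece-wise constant "depth-like" signal
noisy = clean + 0.3 * rng.standard_normal(clean.size)
denoised = tv_admm(noisy)
print(round(float(np.abs(denoised - clean).mean()), 3),
      round(float(np.abs(noisy - clean).mean()), 3))
```

TV regularisation favours piece-wise constant solutions, which is why it suits ill-posed reconstruction of signals with smooth regions separated by sharp transitions.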

  • 17.
    Jiang, Meng
    et al.
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Electronics Design.
    Nnonyelu, Chibuzo Joseph
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Electronics Design.
    Lundgren, Jan
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Electronics Design.
    Sjöström, Mårten
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Systems and Technology.
    Thungström, Göran
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Electronics Design.
    Gao, Shan
    Performance Comparison of Omni and Cardioid Directional Microphones for Indoor Angle of Arrival Sound Source Localization. 2022. In: Conference Record - IEEE Instrumentation and Measurement Technology Conference, IEEE, 2022. Conference paper (Refereed)
    Abstract [en]

    Sound source localization technology makes it possible to map sound source positions. In this paper, angle of arrival (AOA) has been chosen as the method for achieving sound source localization in an indoor enclosed environment. The dynamic environment and reverberations pose a challenge for AOA-based systems in such applications. Acknowledging microphone directionality, cardioid-directional microphone systems have been chosen for a localization performance comparison with omni-directional microphone systems, in order to investigate which microphone type is superior for AOA indoor sound source localization. To reduce hardware complexity, the number of microphones used during the experiment has been limited to four. A localization improvement based on a weighting factor has been proposed. The comparison has been done for both types of microphones with three different array manifolds under the same system setup, and it shows that the cardioid-directional microphone system has an overall higher accuracy.

  • 18.
    Jiang, Meng
    et al.
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Computer and Electrical Engineering (2023-).
    Nnonyelu, Chibuzo Joseph
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Computer and Electrical Engineering (2023-).
    Lundgren, Jan
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Computer and Electrical Engineering (2023-).
    Thungström, Göran
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Engineering, Mathematics, and Science Education (2023-).
    Sjöström, Mårten
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Computer and Electrical Engineering (2023-).
    A Coherent Wideband Acoustic Source Localization Using a Uniform Circular Array. 2023. In: Sensors, E-ISSN 1424-8220, Vol. 23, no 11, article id 5061. Article in journal (Refereed)
    Abstract [en]

    In modern applications such as robotics, autonomous vehicles, and speaker localization, the computational power available for sound source localization can be limited when other functionalities become more complex. In such application fields, there is a need to maintain high localization accuracy for several sound sources while reducing computational complexity. The array manifold interpolation (AMI) method applied with the Multiple Signal Classification (MUSIC) algorithm enables sound source localization of multiple sources with high accuracy, but its computational complexity has so far been relatively high. This paper presents a modified AMI for uniform circular arrays (UCA) that offers reduced computational complexity compared to the original AMI. The complexity reduction is based on the proposed UCA-specific focusing matrix, which eliminates the calculation of the Bessel function. A simulation comparison is made with the existing methods iMUSIC, the Weighted Squared Test of Orthogonality of Projected Subspaces (WS-TOPS), and the original AMI. The experimental results under different scenarios show that the proposed algorithm outperforms the original AMI method in terms of estimation accuracy, with up to a 30% reduction in computation time. A further advantage of the proposed method is the ability to implement wideband array processing on low-end microprocessors.
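The narrowband MUSIC step that AMI-based wideband processing reduces to can be sketched for a uniform circular array as follows. This is a minimal simulation with assumed geometry (8 elements, radius of half a wavelength), SNR, and snapshot count, not the paper's wideband focusing-matrix method.

```python
import numpy as np

def uca_steering(theta, m=8, r_over_lambda=0.5):
    """Narrowband steering vector of an M-element uniform circular array."""
    phi = 2 * np.pi * np.arange(m) / m                  # sensor angles on the circle
    return np.exp(2j * np.pi * r_over_lambda * np.cos(theta - phi))

def music_spectrum(R, n_src, grid):
    """MUSIC pseudo-spectrum over an azimuth grid, from covariance matrix R."""
    _, V = np.linalg.eigh(R)                            # eigenvalues in ascending order
    En = V[:, :R.shape[0] - n_src]                      # noise subspace
    return np.array([1.0 / np.linalg.norm(En.conj().T @ uca_steering(th)) ** 2
                     for th in grid])

rng = np.random.default_rng(2)
theta_true, snaps = np.deg2rad(60.0), 200
s = rng.standard_normal(snaps) + 1j * rng.standard_normal(snaps)   # source signal
noise = 0.1 * (rng.standard_normal((8, snaps)) + 1j * rng.standard_normal((8, snaps)))
X = uca_steering(theta_true)[:, None] @ s[None, :] + noise         # array snapshots
R = X @ X.conj().T / snaps                                         # sample covariance
grid = np.deg2rad(np.arange(0.0, 360.0, 0.5))
est = np.rad2deg(grid[np.argmax(music_spectrum(R, 1, grid))])
print(est)
```

For wideband signals, methods such as AMI first map the covariance matrices of all frequency bins to a common focusing frequency before a single MUSIC search like the one above.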

  • 19.
    Jonsson, Patrik
    et al.
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Technology and Media.
    Dobslaw, Felix
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Technology and Media.
    Decision support system for variable speed regulation. 2012. Conference paper (Refereed)
    Abstract [en]

    Recommending a suitable speed limit for roads is important for road authorities in order to increase traffic safety. Nowadays, these speed limits can be set more dynamically with digital speed regulation signs. The challenge is handling input from the environment in combination with probabilities for certain events. Here we present a decision support model based on a dynamic Bayesian network. The purpose of this model is to predict the appropriate speed on the basis of weather data, traffic density, and road maintenance activities. The dynamic Bayesian network principle of modelling uncertainty in the involved variables makes it possible to model the real conditions. This model shows that it is possible to develop automated decision support systems for variable speed regulation.
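The idea of recommending a speed from uncertain environmental inputs can be sketched with a toy discrete probabilistic model. All conditional probabilities, thresholds, and variable names below are invented for illustration; they are not the paper's dynamic Bayesian network or its parameters.

```python
# Hypothetical conditional probability tables (illustrative values only).
P_SLIPPERY = {"dry": 0.05, "wet": 0.4, "snow": 0.9}   # P(road slippery | weather)
P_SAFE_GIVEN = {True: 0.1, False: 0.9}                # P(high speed safe | slippery)

def recommend(weather: str, dense_traffic: bool) -> int:
    """Marginalize over the unobserved 'slippery' state, then threshold."""
    ps = P_SLIPPERY[weather]
    p_safe = ps * P_SAFE_GIVEN[True] + (1 - ps) * P_SAFE_GIVEN[False]
    if dense_traffic:                                  # traffic density lowers safety
        p_safe *= 0.7
    return 100 if p_safe > 0.6 else 70                 # recommended limit in km/h

print(recommend("dry", False), recommend("snow", False))
```

A real dynamic Bayesian network would additionally carry the belief state over time steps, so that yesterday's road maintenance and weather observations inform today's recommendation.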

  • 20.
    Karbalaie, Abdolamir
    et al.
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Systems and Technology.
    Abtahi, Farhad
    KTH; Karolinska Institutet, Stockholm, Sweden.
    Sjöström, Mårten
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Systems and Technology.
    Event detection in surveillance videos: a review. 2022. In: Multimedia tools and applications, ISSN 1380-7501, E-ISSN 1573-7721, Vol. 81, no 24, p. 35463-35501. Article in journal (Refereed)
    Abstract [en]

    Since 2008, a variety of systems have been designed to detect events in security cameras, and more than a hundred journal articles and conference papers have been published in this field. However, no survey has focused on recognizing events in surveillance systems, which motivated us to provide a comprehensive review of the different developed event detection systems. We start our discussion with the pioneering methods that used the TRECVid-SED dataset, and then cover methods developed using the VIRAT dataset in the TRECVid evaluation. To better understand the designed systems, we describe the components of each method and the modifications of existing methods separately. We outline the significant challenges related to untrimmed security video action detection, and suitable metrics are presented for assessing the performance of the proposed models. Our study indicates that, for the TRECVid-SED dataset, the majority of researchers classified events into two groups on the basis of the number of participants and the duration of the event. Depending on the group of events, one or more models were used to identify all the events. For the VIRAT dataset, object detection models were used throughout the work to localize the first-stage activities. With one exception, a 3D convolutional neural network (3D-CNN) was used to extract spatio-temporal features or to classify different activities. From the review that has been carried out, it is possible to conclude that developing an automatic surveillance event detection system requires accurate and fast object detection in the first stage to localize the activities, and a classification model to draw conclusions from the input values.

  • 21.
    Karlsson, Linda Sofia
    et al.
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Technology and Media.
    Sjöström, Mårten
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Technology and Media.
    A Spatio-Temporal Filter for Region-of-Interest Video Coding. Manuscript (preprint) (Other academic)
    Abstract [en]

    Region-of-interest (ROI) video coding increases the quality in regions interesting to the viewer at the expense of quality in the background. This enables a high perceived quality at low bit rates. A successfully detected ROI can be used to control the bit allocation in the encoding. In this paper we present a filter that is independent of codec and standard, and is applied in both the spatial and the temporal domains. The filter's ability to reduce the number of bits necessary to encode the background, and where these bits are re-allocated, is analyzed theoretically. The computational complexity of the algorithms is also determined. The quality is evaluated using the PSNR of the ROI and subjective tests. The tests showed that the spatio-temporal filter has a better coding efficiency than using only spatial or only temporal filtering. The filter successfully re-allocates bits from the background to the foreground.

  • 22.
    Karlsson, Linda Sofia
    et al.
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Technology and Media.
    Sjöström, Mårten
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Technology and Media.
    Layer assignment based on depth data distribution for multiview-plus-depth scalable video coding. 2011. In: IEEE transactions on circuits and systems for video technology (Print), ISSN 1051-8215, E-ISSN 1558-2205, Vol. 21, no 6, p. 742-754. Article in journal (Refereed)
    Abstract [en]

    Three-dimensional (3D) video is experiencing rapid growth in a number of areas, including 3D cinema, 3DTV, and mobile phones. Several problems must be addressed to display captured 3D video at another location. One problem is how to represent the data. The multiview-plus-depth representation of a scene requires a lower bit rate than transmitting all views required by an application, and provides more information than a 2D-plus-depth sequence. Another problem is how to handle transmission in a heterogeneous network. Scalable video coding enables adaptation of a 3D video sequence to the conditions at the receiver. In this paper we present a scheme that combines scalability based on the position in depth of the data and on the distance to the center view. The general scheme preserves the center view data, whereas the data of the remaining views are extracted into enhancement layers depending on the distance to the viewer and to the center camera. Within a view, data is assigned to enhancement layers based on the depth data distribution, and strategies concerning the layer assignment between adjacent views are proposed. In general, each extracted enhancement layer increases the visual quality and PSNR compared to only using center view data. The bit rate per layer can be further decreased if depth data is distributed over the enhancement layers. The choice of strategy to assign layers between adjacent views depends on whether the quality of the foremost objects in the scene or the quality of the views close to the center is more important.

  • 23.
    Li, Yongwei
    et al.
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Systems and Technology.
    Pla, Filiberto
    University Jaume I, Castellón de la Plana, Spain.
    Sjöström, Mårten
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Systems and Technology.
    Fernandez-Beltran, Ruben
    University of Murcia, Murcia, Spain.
    Simultaneous Color Restoration and Depth Estimation in Light Field Imaging. 2022. In: IEEE Access, E-ISSN 2169-3536, Vol. 10, p. 49599-49610. Article in journal (Refereed)
    Abstract [en]

    Recent studies in light field imaging have shown the potential and advantages of different light field information processes. In most of the existing techniques, the light field processing pipeline has been treated in a step-by-step manner, with each step considered independent from the others. For example, in light field color demosaicing, inferring the scene geometry is treated as an irrelevant and negligible task, and vice versa. Such processing techniques may fail due to the inherent connection among the different steps, resulting in both corrupted post-processing and defective pre-processing results. In this paper, we address the interaction between color interpolation and depth estimation in light fields, and propose a probabilistic approach to handle these two processing steps jointly. This probabilistic framework is based on a Markov Random Field Collaborative Graph Model for simultaneous Demosaicing and Depth Estimation (CGMDD), which explores the color-depth interdependence from general light field sampling. Experimental results show that both image interpolation quality and depth estimation can benefit from their interaction, mainly for processes such as image demosaicing that are shown to be sensitive to depth information, especially for light field sampling with large baselines.

  • 24.
    Li, Yun
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information and Communication systems.
    Coding of three-dimensional video content: Depth image coding by diffusion. 2013. Licentiate thesis, comprehensive summary (Other academic)
    Abstract [en]

    Three-dimensional (3D) movies in theaters have become a massive commercial success during recent years, and it is likely that, with the advancement of display technologies and the production of 3D contents, TV broadcasting in 3D will play an important role in home entertainment in the not too distant future. 3D video contents contain at least two views from different perspectives for the left and the right eye of viewers. The amount of coded information is doubled if these views are encoded separately. Moreover, for multi-view displays (i.e. displays where different perspectives of a scene in 3D are presented to the viewer at the same time through different angles), either video streams of all the required views must be transmitted to the receiver, or the displays must synthesize the missing views from a subset of the views. The latter approach has been widely proposed to reduce the amount of data being transmitted. The virtual views can be synthesized by the Depth Image Based Rendering (DIBR) approach from textures and associated depth images. However, it is still the case that the amount of information for the textures plus the depths presents a significant challenge for the network transmission capacity. An efficient compression will, therefore, increase the availability of content access and provide a better video quality under the same network capacity constraints.

    In this thesis, the compression of depth images is addressed. These depth images can be assumed to be piece-wise smooth. Starting from the properties of depth images, a novel depth image model based on edges and sparse samples is presented, which may also be utilized for depth image post-processing. Based on this model, a depth image coding scheme that explicitly encodes the locations of depth edges is proposed; the coding scheme has a scalable structure. Furthermore, a compression scheme for block-based 3D-HEVC is also devised, in which diffusion is used for intra prediction. In addition to the proposed schemes, the thesis illustrates several evaluation methodologies, especially the subjective test of the stimulus-comparison method. It is suitable for evaluating the quality of two impaired images, as the objective metrics are inaccurate with respect to synthesized views.

    The MPEG test sequences were used for the evaluation. The results showed that virtual views synthesized from post-processed depth images by using the proposed model are better than those synthesized from original depth images. More importantly, the proposed coding schemes using such a model produced better synthesized views than the state of the art schemes. As a result, the outcome of the thesis can lead to a better quality of 3DTV experience.

  • 25.
    Li, Yun
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information and Communication systems.
    Coding of Three-dimensional Video Content: Diffusion-based Coding of Depth Images and Displacement Intra-Coding of Plenoptic Contents. 2015. Doctoral thesis, monograph (Other academic)
    Abstract [en]

    In recent years, the three-dimensional (3D) movie industry has reaped massive commercial success in the theaters. With the advancement of display technologies and more mature capturing and generation of 3D contents, TV broadcasting, movies, and games in 3D have entered home entertainment, and it is likely that 3D applications will play an important role in many aspects of people's lives in the not too distant future. 3D video contents contain at least two views from different perspectives for the left and the right eye of viewers. The amount of coded information is doubled if these views are encoded separately. Moreover, for multi-view displays (i.e. displays where different perspectives of a scene in 3D are presented to the viewer at the same time through different angles), either video streams of all the required views must be transmitted to the receiver, or the displays must synthesize the missing views from a subset of the views. The latter approach has been widely proposed to reduce the amount of data being transmitted and to make the data adjustable to 3D displays. The virtual views can be synthesized by the Depth Image Based Rendering (DIBR) approach from textures and associated depth images. However, it is still the case that the amount of information for the textures plus the depths presents a significant challenge for the network transmission capacity. Compression techniques are vital to facilitate the transmission. In addition to multi-view and multi-view-plus-depth for reproducing 3D, light field techniques have recently become a hot topic. Light field capturing aims at acquiring not only spatial but also angular information of a view, and an ideal light field rendering device should be such that viewers would perceive it as looking through a window. Thus, light field techniques are a step forward to providing us with a more authentic perception of 3D.
Among many light field capturing approaches, focused plenoptic capturing is a solution that utilizes microlens arrays. The plenoptic cameras are also portable and commercially available. Multi-view and refocusing can be obtained during post-production from these cameras. However, the captured plenoptic images are of a large size and contain a significant amount of redundant information. An efficient compression of the above-mentioned contents will, therefore, increase the availability of content access and provide a better quality experience under the same network capacity constraints. In this thesis, the compression of depth images and of plenoptic contents captured by focused plenoptic cameras is addressed. The depth images can be assumed to be piece-wise smooth. Starting from the properties of depth images, a novel depth image model based on edges and sparse samples is presented, which may also be utilized for depth image post-processing. Based on this model, a depth image coding scheme that explicitly encodes the locations of depth edges is proposed, and the coding scheme has a scalable structure. Furthermore, a compression scheme for block-based 3D-HEVC is also devised, in which diffusion is used for intra prediction. In addition to the proposed schemes, the thesis illustrates several evaluation methodologies, especially the subjective test of the stimulus-comparison method. This is suitable for evaluating the quality of two impaired images, as the objective metrics are inaccurate with respect to synthesized views. For the compression of plenoptic contents, displacement intra prediction with more than one hypothesis is applied and implemented in HEVC for an efficient prediction. In addition, a scalable coding approach utilizing a sparse set and disparities is introduced for the coding of focused plenoptic images.
The MPEG test sequences were used for the evaluation of the proposed depth image compression, and publicly available plenoptic image and video contents were used for the assessment of the proposed plenoptic compression. For depth image coding, the results showed that virtual views synthesized from depth images post-processed using the proposed model are better than those synthesized from the original depth images. More importantly, the proposed coding schemes using such a model produced better synthesized views than the state-of-the-art schemes. For the plenoptic contents, the proposed scheme achieved an efficient prediction and reduced the bit rate significantly while providing coding and rendering scalability. As a result, the outcome of the thesis can lead to an improved quality of the 3DTV experience and facilitate the development of 3D applications in general.

  • 26.
    Li, Yun
    et al.
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information and Communication systems.
    Sjöström, Mårten
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information and Communication systems.
    Jennehag, Ulf
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information and Communication systems.
    Olsson, Roger
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information and Communication systems.
    Depth Image Post-processing Method by Diffusion. 2013. In: Proceedings of SPIE - The International Society for Optical Engineering: 3D Image Processing (3DIP) and Applications, SPIE - International Society for Optical Engineering, 2013, Art. no. 865003. Conference paper (Refereed)
    Abstract [en]

    Multi-view three-dimensional television relies on view synthesis to reduce the number of views being transmitted. Arbitrary views can be synthesized by utilizing corresponding depth images with textures. The depth images obtained from stereo pairs or range cameras may contain erroneous values, which entail artifacts in a rendered view. Post-processing of the data may then be utilized to enhance the depth image, with the purpose of reaching a better quality of synthesized views. We propose a Partial Differential Equation (PDE)-based interpolation method for the reconstruction of the smooth areas in depth images, while preserving significant edges. We modeled the depth image by adjusting thresholds for edge detection and a uniform sparse sampling factor, followed by second-order PDE interpolation. The objective results show that a depth image processed by the proposed method can achieve a better quality of synthesized views than the original depth image. Visual inspection confirmed the results.
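The diffusion idea above, reconstructing smooth areas from edge and sample pixels while keeping those pixels fixed, can be sketched with a simple discrete Laplace iteration. This is a minimal sketch on a synthetic piece-wise smooth image, assuming homogeneous diffusion; it is not the paper's exact edge-detection and sampling scheme.

```python
import numpy as np

def diffuse(known_mask, values, n_iter=2000):
    """Fill unknown pixels by iterating the discrete Laplace equation,
    holding the known (edge and sparse-sample) pixels fixed."""
    img = np.where(known_mask, values, values[known_mask].mean())
    for _ in range(n_iter):
        avg = 0.25 * (np.roll(img, 1, 0) + np.roll(img, -1, 0) +
                      np.roll(img, 1, 1) + np.roll(img, -1, 1))
        img = np.where(known_mask, values, avg)   # re-impose the known pixels
    return img

# Synthetic piece-wise smooth "depth" image: two flat regions with one sharp edge.
depth = np.zeros((32, 32))
depth[:, 16:] = 10.0
mask = np.zeros_like(depth, dtype=bool)
mask[:, [0, 15, 16, 31]] = True                   # keep the borders and the edge columns
rec = diffuse(mask, depth)
print(round(float(np.abs(rec - depth).max()), 4))
```

Because the pixels on both sides of the depth edge are kept fixed, diffusion fills each region independently and the sharp transition survives, which is the property that makes the synthesized views usable.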

  • 27.
    Li, Yun
    et al.
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information and Communication systems.
    Sjöström, Mårten
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information and Communication systems.
    Jennehag, Ulf
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information and Communication systems.
    Olsson, Roger
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information and Communication systems.
    Tourancheau, Sylvain
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information and Communication systems.
    Subjective Evaluation of an Edge-based Depth Image Compression Scheme. 2013. In: Proceedings of SPIE - The International Society for Optical Engineering: Stereoscopic Displays and Applications XXIV, SPIE - International Society for Optical Engineering, 2013, Art. no. 86480D. Conference paper (Refereed)
    Abstract [en]

    Multi-view three-dimensional television requires many views, which may be synthesized from two-dimensional images with accompanying pixel-wise depth information. This depth image, which typically consists of smooth areas and sharp transitions at object borders, must be consistent with the acquired scene in order for synthesized views to be of good quality. We have previously proposed a depth image coding scheme that preserves significant edges and encodes the smooth areas between these. An objective evaluation considering the structural similarity (SSIM) index for synthesized views demonstrated an advantage of the proposed scheme over the High Efficiency Video Coding (HEVC) intra mode in certain cases. However, there were some discrepancies between the outcomes of the objective evaluation and of our visual inspection, which motivated this study of subjective tests. The test was conducted according to the ITU-R BT.500-13 recommendation with stimulus-comparison methods. The results from the subjective test showed that the proposed scheme performs slightly better than HEVC, with statistical significance at the majority of the tested bit rates for the given contents.

  • 28.
    Li, Yun
    et al.
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information and Communication systems.
    Sjöström, Mårten
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information and Communication systems.
    Olsson, Roger
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information and Communication systems.
    Coding of plenoptic images by using a sparse set and disparities. 2015. In: Proceedings - IEEE International Conference on Multimedia and Expo, IEEE conference proceedings, 2015, Art. no. 7177510. Conference paper (Refereed)
    Abstract [en]

    A focused plenoptic camera captures not only the spatial information of a scene but also the angular information. The capturing results in a large plenoptic image consisting of multiple microlens images. In addition, the microlens images are similar to their neighbors. Therefore, an efficient compression method that utilizes this pattern of similarity can reduce the coding bit rate and further facilitate the usage of the images. In this paper, we propose an approach for coding focused plenoptic images by using a representation that consists of a sparse plenoptic image set and disparities. Based on this representation, a reconstruction method using interpolation and inpainting is devised to reconstruct the original plenoptic image. As a consequence, instead of coding the original image directly, we encode the sparse image set plus the disparity maps and use the reconstructed image as a prediction reference to encode the original image. The results show that the proposed scheme performs better than HEVC intra, with a gain of more than 5 dB in PSNR or over 60 percent bit rate reduction.

  • 29.
    Li, Yun
    et al.
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information and Communication systems.
    Sjöström, Mårten
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information and Communication systems.
    Olsson, Roger
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information and Communication systems.
    Jennehag, Ulf
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information and Communication systems.
    Coding of focused plenoptic contents by displacement intra prediction. 2016. In: IEEE transactions on circuits and systems for video technology (Print), ISSN 1051-8215, E-ISSN 1558-2205, Vol. 26, no 7, p. 1308-1319, article id 7137669. Article in journal (Refereed)
    Abstract [en]

    A light field is commonly described by a two-plane representation with four dimensions. Refocused three-dimensional contents can be rendered from light field images. A method for capturing these images is to use cameras with microlens arrays. A dense sampling of the light field results in large amounts of redundant data, so efficient compression is vital for a practical use of these data. In this paper, we propose a displacement intra prediction scheme with a maximum of two hypotheses for the compression of plenoptic contents from focused plenoptic cameras. The proposed scheme is further implemented in HEVC. The work aims at coding plenoptic captured contents efficiently without knowledge of the underlying camera geometries. In addition, a theoretical analysis of displacement intra prediction for plenoptic images is presented, and the relationship between the compressed captured images and their rendered quality is analyzed. Evaluation results show that plenoptic contents can be efficiently compressed by the proposed scheme. Bit rate reductions of up to 60 percent over HEVC are obtained for plenoptic images, and more than 30 percent is achieved for the tested video sequences.

    Download full text (pdf)
    fulltext
  • 30.
    Li, Yun
    et al.
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information and Communication systems.
    Sjöström, Mårten
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information and Communication systems.
    Olsson, Roger
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information and Communication systems.
    Jennehag, Ulf
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information and Communication systems.
    Efficient Intra Prediction Scheme For Light Field Image Compression2014In: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings, IEEE conference proceedings, 2014, p. Art. no. 6853654-Conference paper (Refereed)
    Abstract [en]

    Interactive photo-realistic graphics can be rendered using light field datasets. One way of capturing a dataset is with light field cameras with microlens arrays. The captured images contain repetitive patterns resulting from adjacent microlenses and do not resemble natural scenes. This dissimilarity causes problems when light field images are compressed with traditional image and video encoders, which are optimized for natural images and video sequences. In this paper, we introduce the full inter-prediction scheme of HEVC into intra-prediction for the compression of light field images. The proposed scheme is capable of performing both unidirectional and bi-directional prediction within an image. The evaluation results show that quality improvements above 3 dB in terms of BD-PSNR, or bit-rate savings above 50 percent, can be achieved by the proposed scheme compared to the original HEVC intra-prediction for light field images.

    Download full text (pdf)
    fulltext
  • 31.
    Li, Yun
    et al.
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information and Communication systems.
    Sjöström, Mårten
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information and Communication systems.
    Olsson, Roger
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information and Communication systems.
    Jennehag, Ulf
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information and Communication systems.
    Scalable coding of plenoptic images by using a sparse set and disparities2016In: IEEE Transactions on Image Processing, ISSN 1057-7149, E-ISSN 1941-0042, Vol. 25, no 1, p. 80-91, article id 7321029Article in journal (Refereed)
    Abstract [en]

    Focused plenoptic capturing is one of the light field capturing techniques. By placing a microlens array in front of the photosensor, focused plenoptic cameras capture both spatial and angular information of a scene in each microlens image and across microlens images. The capturing results in a significant amount of redundant information, and the captured image is usually of large resolution. A coding scheme that removes the redundancy before coding is advantageous for efficient compression, transmission, and rendering. In this paper, we propose a lossy coding scheme to efficiently represent plenoptic images. The format contains a sparse image set and its associated disparities. The reconstruction is performed by disparity-based interpolation and inpainting, and the reconstructed image is then employed as a prediction reference for the coding of the full plenoptic image. As an outcome of the representation, the proposed scheme inherits a scalable structure with three layers. The results show that plenoptic images are compressed efficiently, with over 60 percent bit rate reduction compared to HEVC intra, and over 20 percent compared to HEVC block copying mode.
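    The disparity-based interpolation can be illustrated with a minimal sketch: a microlens image dropped from the sparse set is predicted by shifting its two kept neighbours toward the dropped position by the per-pair disparity and averaging. Integer disparities and `np.roll` wrap-around are simplifying assumptions; the paper's scheme additionally uses inpainting and codes the full plenoptic image against this prediction.

    ```python
    import numpy as np

    def predict_dropped(left, right, d):
        """Disparity-based interpolation of a microlens image dropped from
        the sparse set: shift the kept left/right neighbours toward the
        dropped position by the per-pair disparity d, then average."""
        return (np.roll(left, d, axis=1) + np.roll(right, -d, axis=1)) / 2.0
    ```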

    Download full text (pdf)
    fulltext
  • 32.
    Mahmood, Nurul
    et al.
    University of Oulu, Oulu, Finland.
    Marchenko, Nikolaj
    Robert Bosch GmbH (Germany), Stuttgart, Germany.
    Gidlund, Mikael
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Systems and Technology.
    Popovski, Petar
    Aalborg University, Aalborg Øst, Denmark.
    Wireless Networks and Industrial IoT: Applications, Challenges and Enablers2020Book (Refereed)
    Abstract [en]

    This book provides a comprehensive overview of the most relevant research and standardization results in the area of wireless networking for Industrial IoT, covering both critical and massive connectivity. Most chapters in this book are intended to serve as short tutorials of particular topics, highlighting the main developments and ideas, as well as giving an outlook of the upcoming research challenges.

    The book is divided into four parts. The first part focuses on challenges, enablers and standardization efforts for reliable low-latency communication in Industrial IoT networks. The next part focuses on massive IoT, which requires cost- and energy-efficient technology components to efficiently connect a massive number of low-cost IoT devices. The third part covers three enabling technologies in the context of Industrial IoT: Security, Machine Learning/Artificial Intelligence and Edge Computing. These enablers are applicable to both connectivity types, critical and massive IoT. The last part covers aspects of Industrial IoT related to connected transportation that are important in, for example, warehouse and port logistics, product delivery and transportation among industries.

    Presents a comprehensive guide to concepts and research challenges in wireless networking for Industrial IoT; includes an introduction and overview of such topics as 3GPP standardization for Industrial IoT, Time Sensitive Networking, system dependability over wireless networks, energy-efficient wireless networks, IoT security, ML/AI for Industrial IoT and connected transportation systems; features contributions by well-recognized experts from both academia and industry.

  • 33.
    Muddala, Suryanarayana M.
    et al.
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information and Communication systems.
    Olsson, Roger
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information and Communication systems.
    Sjöström, Mårten
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information and Communication systems.
    Disocclusion Handling Using Depth-Based Inpainting2013In: Proceedings of MMEDIA 2013, The Fifth InternationalConferences on Advances in Multimedia, Venice, Italy, 2013, International Academy, Research and Industry Association (IARIA), 2013, p. 136-141Conference paper (Refereed)
    Abstract [en]

    Depth-image-based rendering (DIBR) plays an important role in producing virtual views using 3D-video formats such as video plus depth (V+D) and multi-view-video-plus-depth (MVD). Pixel regions with undefined values (due to disoccluded areas) are exposed when DIBR is used. In this paper, we propose a depth-based inpainting method aimed at handling disocclusions in DIBR from V+D and MVD. Our method adopts the curvature driven diffusion (CDD) model as a data term, to which we add a depth constraint. In addition, we add depth to further guide a directional priority term in the exemplar-based texture synthesis. Finally, we add depth in the patch-matching step to prioritize background texture when inpainting. The proposed method is evaluated by comparing inpainted virtual views with corresponding views produced by three state-of-the-art inpainting methods as references. The evaluation shows that the proposed method yields increased objective quality compared to the reference methods, and visual inspection further indicates an improved visual quality.

    Download full text (pdf)
    fulltext
  • 34.
    Muddala, Suryanarayana Murthy
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information and Communication systems.
    Free View Rendering for 3D Video: Edge-Aided Rendering and Depth-Based Image Inpainting2015Doctoral thesis, monograph (Other academic)
    Abstract [en]

    Three-Dimensional Video (3DV) has become increasingly popular with the success of 3D cinema. Moreover, emerging display technology offers an immersive experience to the viewer without the need for visual aids such as 3D glasses. The 3DV applications Three-Dimensional Television (3DTV) and Free Viewpoint Television (FTV) are promising technologies for living-room environments, providing an immersive experience and look-around capability. In order to provide such an experience, these technologies require a number of camera views captured from different viewpoints. However, capturing and transmitting the required number of views is not feasible, and thus view rendering is employed as an efficient solution to produce the necessary number of views. Depth-image-based rendering (DIBR) is a commonly used rendering method. Although DIBR is a simple approach that can produce the desired number of views, inherent artifacts are major issues in view rendering. Despite much effort to tackle the rendering artifacts over the years, rendered views still contain visible artifacts.

    This dissertation addresses three problems in order to improve 3DV quality: 1) how to improve the rendered view quality using a direct approach, without dealing with each artifact specifically; 2) how to handle disocclusions (a.k.a. holes) in the rendered views in a visually plausible manner using inpainting; and 3) how to reduce spatial inconsistencies in the rendered view. The first problem is tackled by an edge-aided rendering method that uses a direct approach with one-dimensional interpolation, which is applicable when the virtual camera distance is small. The second problem is addressed by a depth-based inpainting method in the virtual view, which reconstructs the missing texture with background data at the disocclusions. The third problem is addressed by a rendering method that first inpaints occlusions as a layered depth image (LDI) in the original view, and then renders a spatially consistent virtual view.

    Objective assessments of the proposed methods show improvements over state-of-the-art rendering methods. Visual inspection shows slight improvements for intermediate views rendered from multiview video-plus-depth, and the proposed methods outperform other view rendering methods in the case of rendering from single-view video-plus-depth. The results confirm that the proposed methods are capable of reducing rendering artifacts and producing spatially consistent virtual views.

    In conclusion, the view rendering methods proposed in this dissertation can support the production of high quality virtual views based on a limited number of input views. When used to create a multi-scopic presentation, the outcome of this dissertation can benefit 3DV technologies to improve the immersive experience.

    Download full text (pdf)
    Doctoral thesis 226
    Download (pdf)
    Errata
  • 35.
    Muddala, Suryanarayana Murthy
    et al.
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information and Communication systems.
    Olsson, Roger
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information and Communication systems.
    Sjöström, Mårten
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information and Communication systems.
    Depth-Included Curvature Inpainting for Disocclusion Filling in View Synthesis2013In: International Journal On Advances in Telecommunications, ISSN 1942-2601, E-ISSN 1942-2601, Vol. 6, no 3&4, p. 132-142Article in journal (Refereed)
    Abstract [en]

    Depth-image-based rendering (DIBR) is commonly used for generating additional views for 3DTV and FTV using 3D video formats such as video plus depth (V+D) and multi-view-video-plus-depth (MVD). The synthesized views suffer from artifacts, mainly disocclusions, when DIBR is used. Depth-based inpainting methods can solve these problems plausibly. In this paper, we analyze the influence of the depth information at various steps of the depth-included curvature inpainting method. The depth-based inpainting method relies on the depth information at every step of the inpainting process: boundary extraction for missing areas, data term computation for structure propagation, and patch matching to find the best data. The importance of depth at each step is evaluated using objective metrics and visual comparison. Our evaluation demonstrates that the depth information in each step plays a key role; moreover, to what degree depth can be used in each step of the inpainting process depends on the depth distribution.

    Download full text (pdf)
    fulltext
  • 36.
    Muddala, Suryanarayana Murthy
    et al.
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information and Communication systems.
    Sjöström, Mårten
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information and Communication systems.
    Olsson, Roger
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information and Communication systems.
    Depth-Based Inpainting For Disocclusion Filling2014In: 3DTV-Conference, IEEE Computer Society, 2014, p. Art. no. 6874752-Conference paper (Refereed)
    Abstract [en]

    Depth-based inpainting methods can solve disocclusion problems occurring in depth-image-based rendering. However, inpainting in this context suffers from artifacts along foreground objects due to foreground pixels in the patch matching. In this paper, we address the disocclusion problem with a refined depth-based inpainting method. The novelty lies in classifying the foreground and background using available local depth information. Thereby, the foreground information is excluded from both the source region and the target patch. In the proposed inpainting method, the local depth constraints imply inpainting only the background data and preserving the foreground object boundaries. The results from the proposed method are compared with those from state-of-the-art inpainting methods. The experimental results demonstrate improved objective quality and a better visual quality along the object boundaries.
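    The depth-based foreground/background classification can be sketched as a simple threshold on the depth values around the disocclusion, so that only background pixels enter the source region for patch matching. The midpoint threshold and the "larger depth value = farther away" convention are illustrative assumptions, not the paper's exact classifier.

    ```python
    import numpy as np

    def background_source_mask(depth, hole):
        """Return a mask of known background pixels usable as the inpainting
        source region: known pixels whose depth lies beyond a local threshold,
        so foreground pixels are excluded from the patch matching."""
        known = ~hole
        d = depth[known]
        thresh = (d.min() + d.max()) / 2.0  # simple split of the local depth range
        return known & (depth >= thresh)    # convention: larger depth = farther
    ```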

    Download full text (pdf)
    fulltext
  • 37.
    Muddala, Suryanarayana Murthy
    et al.
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Technology and Media.
    Sjöström, Mårten
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Technology and Media.
    Olsson, Roger
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Technology and Media.
    Edge-preserving depth-image-based rendering method2012In: 2012 International Conference on 3D Imaging, IC3D 2012 - Proceedings, 2012, p. Art. no. 6615113-Conference paper (Refereed)
    Abstract [en]

    Distribution of future 3DTV is likely to use supplementary depth information to a video sequence. New virtual views may then be rendered in order to adjust to different 3D displays. All depth-image-based rendering (DIBR) methods suffer from artifacts in the resulting images, which are corrected by different post-processing. The proposed method is based on fundamental principles of 3D warping. The novelty lies in how the virtual view sample values are obtained from one-dimensional interpolation, where edges are preserved by introducing specific edge-pixels with information about both foreground and background data. This fully avoids the post-processing of filling cracks and holes. We compared rendered virtual views of our method and of the View Synthesis Reference Software (VSRS) and analyzed the results based on typical artifacts. The proposed method obtained better quality for photographic images and similar quality for synthetic images.
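    The one-dimensional interpolation at the heart of such rendering can be sketched per image row: each pixel is projected horizontally by a disparity proportional to inverse depth, and the virtual-view row is obtained by linear interpolation at integer positions from the projected samples, so cracks never arise and need no hole-filling pass. The disparity model and the omission of the edge-pixel handling are simplifying assumptions.

    ```python
    import numpy as np

    def render_row(texture, depth, shift_gain):
        """Render one row of a virtual view: project each pixel to
        x' = x - shift_gain / depth, then linearly interpolate the virtual
        row at integer positions from the projected, non-integer samples."""
        x = np.arange(texture.size, dtype=float)
        xp = x - shift_gain / depth      # projected positions (disparity ~ 1/Z)
        order = np.argsort(xp)           # np.interp needs increasing sample points
        return np.interp(x, xp[order], texture[order].astype(float))
    ```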

    Download full text (pdf)
    fulltext
  • 38.
    Muddala, Suryanarayana Murthy
    et al.
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information and Communication systems.
    Sjöström, Mårten
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information and Communication systems.
    Olsson, Roger
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information and Communication systems.
    Tourancheau, Sylvain
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information and Communication systems.
    Edge-aided virtual view rendering for multiview video plus depth2013In: Proceedings of SPIE Volume 8650, Burlingame, CA, USA, 2013: 3D Image Processing (3DIP) and Applications 2013, SPIE - International Society for Optical Engineering, 2013, p. Art. no. 86500E-Conference paper (Other academic)
    Abstract [en]

    Depth-Image-Based Rendering (DIBR) of virtual views is a fundamental method in three-dimensional (3D) video applications to produce different perspectives from texture and depth information, in particular the multi-view-plus-depth (MVD) format. Artifacts are still present in virtual views as a consequence of imperfect rendering using existing DIBR methods. In this paper, we propose an alternative DIBR method for MVD. In the proposed method we introduce an edge pixel and interpolate pixel values in the virtual view using the actual projected coordinates from two adjacent views, by which cracks and disocclusions are automatically filled. In particular, we propose a method to merge pixel information from two adjacent views in the virtual view before the interpolation; we apply a weighted averaging of projected pixels within the range of one pixel in the virtual view. We compared virtual view images rendered by the proposed method to the corresponding view images rendered by state-of-the-art methods. Objective metrics demonstrated an advantage of the proposed method for most investigated media contents. Subjective test results showed preference for different methods depending on media content, and the test could not demonstrate a significant difference between the proposed method and state-of-the-art methods.

    Download full text (pdf)
    fulltext
  • 39.
    Muddala, Suryanarayana
    et al.
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information and Communication systems.
    Olsson, Roger
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information and Communication systems.
    Sjöström, Mårten
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information and Communication systems.
    Spatio-Temporal Consistent Depth-Image Based Rendering Using Layered Depth Image and Inpainting2016In: EURASIP Journal on Image and Video Processing, ISSN 1687-5176, E-ISSN 1687-5281, Vol. 9, no 1, p. 1-19Article in journal (Refereed)
    Abstract [en]

    Depth-image-based rendering (DIBR) is a commonly used method for synthesizing additional views using the video-plus-depth (V+D) format. A critical issue with DIBR-based view synthesis is the lack of information behind foreground objects. This lack is manifested as disocclusions (holes) next to the foreground objects in rendered virtual views, a consequence of the virtual camera “seeing” behind the foreground object. The disocclusions are larger in the extrapolation case, i.e., the single-camera case. Texture synthesis methods (inpainting methods) aim to fill these disocclusions by producing plausible texture content. However, virtual views inevitably exhibit both spatial and temporal inconsistencies at the filled disocclusion areas, depending on the scene content. In this paper, we propose a layered depth image (LDI) approach that improves the spatio-temporal consistency. In the process of LDI generation, depth information is used to classify the foreground and background in order to form a static scene sprite from a set of neighboring frames. Occlusions in the LDI are then identified and filled using inpainting, such that no disocclusions appear when the LDI data is rendered to a virtual view. In addition to the depth information, optical flow is computed to extract the stationary parts of the scene and to classify the occlusions in the inpainting process. Experimental results demonstrate that spatio-temporal inconsistencies are significantly reduced using the proposed method. Furthermore, subjective and objective qualities are improved compared to state-of-the-art reference methods.

    Download full text (pdf)
    Fulltext Open Access
  • 40.
    Navarro, Hector
    et al.
    Department of Optics, University of Valencia, E-46100 Burjassot, Spain.
    Saavedra, Genaro
    Department of Optics, University of Valencia, E-46100 Burjassot, Spain.
    Martinez-Corral, Manuel
    Department of Optics, University of Valencia, E-46100 Burjassot, Spain.
    Sjöström, Mårten
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Computer and System science.
    Olsson, Roger
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Computer and System science.
    Depth-of-field enhancement in integral imaging by selective depth-deconvolution2014In: IEEE/OSA Journal of Display Technology, ISSN 1551-319X, E-ISSN 1558-9323, Vol. 10, no 3, p. 182-188Article in journal (Refereed)
    Abstract [en]

    One of the major drawbacks of the integral imaging technique is its limited depth of field. This limitation is imposed by the numerical aperture of the microlenses. In this paper, we propose a method to extend the depth of field of integral imaging systems in the reconstruction stage. The method is based on the combination of deconvolution tools and depth filtering of each elemental image using disparity map information. We demonstrate our proposal by presenting digital reconstructions of a 3D scene focused at different depths with extended depth of field.
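    The selective depth-deconvolution idea can be sketched by deconvolving the image once per depth plane with that plane's PSF and compositing the restored planes according to the disparity-based segmentation. Wiener filtering and the per-plane PSFs here are stand-in assumptions for the paper's deconvolution tools.

    ```python
    import numpy as np

    def wiener_deconv(img, psf, k=1e-3):
        """Frequency-domain Wiener deconvolution with regularization k."""
        H = np.fft.fft2(psf, img.shape)  # zero-pad the PSF to the image size
        G = np.fft.fft2(img)
        return np.real(np.fft.ifft2(np.conj(H) * G / (np.abs(H) ** 2 + k)))

    def selective_deconv(img, depth_map, psf_per_depth):
        """Deconvolve the image once per depth plane with that plane's PSF,
        then composite the restored planes using the depth segmentation."""
        out = np.zeros(img.shape, dtype=float)
        for d, psf in psf_per_depth.items():
            restored = wiener_deconv(img, psf)
            out[depth_map == d] = restored[depth_map == d]
        return out
    ```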

  • 41.
    Navarro-Fructuoso, Hector
    et al.
    Dept. of Optics, Univ. of Valencia, E-46100, Burjassot, Spain .
    Saavedra-Tortosa, G.
    Dept. of Optics, Univ. of Valencia, E-46100, Burjassot, Spain .
    Martinez-Corral, Manuel
    Dept. of Optics, Univ. of Valencia, E-46100, Burjassot, Spain .
    Sjöström, Mårten
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information and Communication systems.
    Olsson, Roger
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information and Communication systems.
    Extended depth-of-field in integral imaging by depth-dependent deconvolution2013Conference paper (Refereed)
    Abstract [en]

    Integral Imaging is a technique to obtain true color 3D images that can provide full and continuous motion parallax for several viewers. The depth of field of these systems is mainly limited by the numerical aperture of each lenslet of the microlens array. A digital method has been developed to increase the depth of field of Integral Imaging systems in the reconstruction stage. By means of the disparity map of each elemental image, it is possible to classify the objects of the scene according to their distance from the microlenses and apply a selective deconvolution for each depth of the scene. Topographical reconstructions with enhanced depth of field of a 3D scene are presented to support our proposal.

    Download full text (pdf)
    Navarro_Extended
  • 42.
    Nikonowicz, Jakub
    et al.
    Poznań University of Technology, 61-131 Poznań, Poland.
    Mahmood, Aamir
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Systems and Technology.
    Gidlund, Mikael
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Systems and Technology.
    A blind signal samples detection algorithm for accurate primary user traffic estimation2020In: Sensors, E-ISSN 1424-8220, Vol. 20, no 15, p. 1-11, article id 4136Article in journal (Refereed)
    Abstract [en]

    The energy detection process for enabling opportunistic spectrum access in dynamic primary user (PU) scenarios, where the PU changes state from active to inactive at random time instances, requires the estimation of several parameters, ranging from noise variance and signal-to-noise ratio (SNR) to instantaneous and average PU activity. A prerequisite to parameter estimation is an accurate extraction of the signal and noise samples in a received signal time frame. In this paper, we propose a signal samples detection algorithm that is of lower complexity and higher accuracy than well-known methods and that is blind to the PU activity distribution. The proposed algorithm is analyzed in a semi-experimental simulation setup for its accuracy and time complexity in recognizing signal and noise samples, and for its use in channel occupancy estimation, under varying occupancy and SNR of the PU signal. The results confirm its suitability for acquiring the necessary information on the dynamic behavior of the PU, which is otherwise assumed to be known in the literature.
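    A minimal blind split of a frame into noise and signal samples can be sketched as follows: estimate the noise floor from the lowest-energy fraction of samples, then flag every sample whose instantaneous energy exceeds a gate above that floor. The fraction and gate factor are illustrative assumptions, not the algorithm proposed in the paper.

    ```python
    import numpy as np

    def detect_signal_samples(x, noise_frac=0.5, gate=3.0):
        """Blindly flag the signal samples in a received frame: the noise
        power is estimated from the lowest-energy `noise_frac` of samples,
        and any sample exceeding `gate` times that estimate is declared
        a signal sample."""
        p = np.abs(x) ** 2                                   # instantaneous energy
        floor = np.sort(p)[: max(1, int(noise_frac * p.size))].mean()
        return p > gate * floor
    ```

    From such a sample mask, quantities like channel occupancy follow directly as the flagged fraction of the frame.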

    Download full text (pdf)
    fulltext
  • 43.
    Nnonyelu, Chibuzo Joseph
    University of Nigeria, Nsukka, Nigeria.
    Rules-of-thumb to design a uniform spherical array for direction finding—Its Cramér–Rao bounds’ nonlinear dependence on the number of sensors2019In: Journal of the Acoustical Society of America, ISSN 0001-4966, E-ISSN 1520-8524, Vol. 145, no 2, p. 714-723Article in journal (Refereed)
    Abstract [en]

    This paper discovers rules-of-thumb on how the estimation precision for an incident source's azimuth-polar direction-of-arrival depends on the number (L) of identical isotropic sensors spaced uniformly on an open sphere of radius R. This estimation's corresponding Cramér–Rao bounds (CRB) are found to follow elegantly simple approximations, useful for array design: (i) for the azimuth arrival angle: ; and (ii) for the polar arrival angle: . Here, denotes the number of snapshots, refers to the incident signal's wavelength, and symbolizes the signal-to-noise power ratio.

  • 44.
    Nnonyelu, Chibuzo Joseph
    et al.
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Computer and Electrical Engineering (2023-).
    Jiang, Meng
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Computer and Electrical Engineering (2023-).
    Lundgren, Jan
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Computer and Electrical Engineering (2023-).
    Spherical-sector harmonics domain processing for wideband source localization using spherical-sector array of directional microphones2023In: Journal of the Acoustical Society of America, ISSN 0001-4966, E-ISSN 1520-8524, Vol. 153, no 3_supplement, p. A54-A54Article in journal (Refereed)
    Abstract [en]

    A spherical microphone array can be uneconomical for applications where the sources arrive only from a known section of the sphere. For this reason, spherical-sector harmonics were developed for processing spherical-sector arrays. The orthonormal spherical-sector harmonics (SSH) basis functions, which account for the discontinuity arising from sectioning the sphere, have been developed and shown to work for arrays of omnidirectional microphones. In this work, the SSH basis functions are applied to far-field wideband sound source localization using a spherical-sector array of first-order directional microphones (cardioid microphones). The array manifold interpolation method is used to produce the steered covariance matrix, and the MUSIC algorithm is applied for direction-of-arrival estimation. The root-mean-square error performance of this spherical-sector array of first-order cardioid microphones is compared against that of omnidirectional microphones for different directions and signal-to-noise ratios.

  • 45.
    Nnonyelu, Chibuzo Joseph
    et al.
    Hong Kong Polytechnic University, Hung Hom, Kowloon, Hong Kong.
    Lee, Charles Hung
    Wong, Kainam Thomas
    Directional pointing error in “spatial matched filter” beamforming at a tri-Axial velocity-sensor with non-orthogonal axes2018In: Proceedings of Meetings on Acoustic, Acoustical Society of America , 2018, Vol. 33, p. 1-7, article id 055004Conference paper (Refereed)
    Abstract [en]

    The “tri-axial velocity-sensor” has three axes that are nominally perpendicular, but may be non-perpendicular in practice, due to real-world imperfections in manufacturing, deployment, or maintenance. This paper comprehensively investigates how such non-perpendicularity would affect the tri-axial velocity-sensor's azimuth-elevation beam-pattern in terms of the beam's pointing direction. It is shown that the non-perpendicular axes do not affect the overall shape of the beampattern, but would introduce a pointing bias.

  • 46.
    Nnonyelu, Chibuzo Joseph
    et al.
    University of Nigeria, Nsukka, Nigeria.
    MADUKWE, C.
    Federal University of Agriculture, Makurdi, Benue, Nigeria.
    MADUKWE, K.
    School of Engineering and Computer Science Victoria University of Wellington, New Zealand.
    Fully steerable collocated first-order cardioid microphone array acoustic beam-pattern2020In: Przeglad Elektrotechniczny, ISSN 0033-2097, E-ISSN 2449-9544, Vol. 96, no 12, p. 76-80Article in journal (Refereed)
    Abstract [en]

    In this paper, a data-independent beamforming method capable of fully steering the mainlobe of a first-order cardioid triad is presented. The proposed method exploits the multi-pattern operating mode of the dual-diaphragm microphone design of some first-order cardioid microphones to extend the polar-azimuth space into which the mainlobe of such a cardioid triad can be steered to the full sphere. The beamwidth and directivity of the proposed collocated array are derived and analyzed.

  • 47.
    Nnonyelu, Chibuzo Joseph
    et al.
    University of Nigeria, Nsukka, Nigeria.
    Morris, Zakayo Ndiku
    Muthaiga, Nairobi, Kenya.
    Madukwe, Chinaza Alice
    Federal University of Agriculture, Makurdi, Nigeria.
    On the Performance of L- and V-Shaped Arrays of Cardioid Microphones for Direction Finding2021In: IEEE Sensors Journal, ISSN 1530-437X, E-ISSN 1558-1748, Vol. 21, no 2, p. 2211-2218Article in journal (Refereed)
    Abstract [en]

    The L-shaped and V-shaped arrays of first-order cardioid microphones for direction finding are presented in this paper. A comparative study of the direction-of-arrival estimation performance of the arrays was carried out by analytically deriving, in closed form, and comparing the Cramér–Rao bounds of an incident signal's direction-of-arrival (DoA) azimuth and polar angles for these arrays. The maximum-likelihood estimator is used to verify the correctness of the derived bounds. This investigation reveals that, for direction finding, the L-shaped array of cardioid microphones would generally outperform the V-shaped array in more sub-regions of the DoA polar-azimuth angle space. However, in regions where the V-shaped array outperforms the L-shaped array, the variance ratios are usually higher in favor of the V-shaped array.

  • 48. Nnonyelu, Chibuzo Joseph
    et al.
    Wong, Kainam Thomas
    Wu, Yue Ivan
    Cardioid microphones/hydrophones in a collocated and orthogonal triad—A steerable beamformer with no beam-pointing error2019In: Journal of the Acoustical Society of America, ISSN 0001-4966, E-ISSN 1520-8524, Vol. 145, no 1, p. 575-588Article in journal (Refereed)
    Abstract [en]

    Cardioid sensors offer low sidelobes/backlobes compared to figure-8 bi-directional sensors (like velocity-sensors). Three cardioid sensors, in orthogonal orientation and in spatial collocation, have recently been proposed by Wong, Nnonyelu, and Wu [(2018). IEEE Trans. Sign. Process. 66(4), 895–906], and such a cardioid-triad's “spatial matched filter” beam-pattern has been analyzed therein. That beam-pattern, unfortunately, suffers from pointing error, i.e., the spatial beam's actual peak direction deviates from the nominal “look direction.” Instead, this paper proposes a steerable data-independent beamformer for the above-mentioned cardioid triad that avoids beam-pointing error. Also analytically derived here (via multivariate calculus) are this beam-pattern's lobes' height ratio, beamwidth, directivity, and array gain.
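    The pointing error that this paper corrects can be reproduced numerically with an idealized triad model: three collocated cardioids along the x, y, z axes, each with response c(u) = (1 + u·e)/2 to a unit arrival direction u. The sketch below (function names and the spherical sampling are this sketch's assumptions, not the paper's beamformer) finds the actual peak of the naive matched-filter beam-pattern and shows it deviates from the look direction:

    ```python
    import numpy as np

    def cardioid_triad_response(u):
        """Responses of three collocated cardioids oriented along x, y, z.
        Each cardioid: c(u) = 0.5 * (1 + cos(angle to its axis))."""
        u = np.atleast_2d(u)
        return 0.5 * (1.0 + u)  # shape (N, 3): one column per sensor axis

    def matched_filter_peak(u_look, n=20000):
        """Peak direction of the naive 'spatial matched filter' beam-pattern
        w = c(u_look), evaluated on a dense Fibonacci spherical grid."""
        k = np.arange(n)
        phi = np.pi * (3.0 - np.sqrt(5.0)) * k
        z = 1.0 - 2.0 * (k + 0.5) / n
        r = np.sqrt(1.0 - z**2)
        grid = np.stack([r * np.cos(phi), r * np.sin(phi), z], axis=1)
        w = cardioid_triad_response(u_look)[0]
        pattern = cardioid_triad_response(grid) @ w
        return grid[np.argmax(pattern)]

    look = np.array([0.0, 0.0, 1.0])  # nominal look direction: +z
    peak = matched_filter_peak(look)
    err_deg = np.degrees(np.arccos(np.clip(peak @ look, -1.0, 1.0)))
    # the actual beam peak sits tens of degrees away from the look direction
    ```

    For the +z look direction the matched-filter pattern reduces to a constant plus 0.25·ux + 0.25·uy + 0.5·uz, whose maximum lies along (1, 1, 2)/√6 — about 35° off axis, which is exactly the kind of beam-pointing error the proposed beamformer eliminates.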

  • 49.
    Pasha, Shahab
    et al.
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Electronics Design.
    Lundgren, Jan
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Electronics Design.
    Ritz, Christian
    Zou, Yuexian
    Distributed microphone arrays, emerging speech and audio signal processing platforms: A review. 2020. In: Advances in Science, Technology and Engineering Systems, ISSN 2415-6698, Vol. 5, no 4, p. 331-343. Article in journal (Refereed)
    Abstract [en]

    Given the ubiquity of digital devices with recording capability, distributed microphone arrays are emerging recording tools for hands-free communication and spontaneous teleconferencing. However, the analysis of signals recorded with diverse sampling rates, time delays, and qualities by distributed microphone arrays is not straightforward and entails important considerations. The crucial challenges include the unknown/changeable geometry of distributed arrays, asynchronous recording, sampling rate mismatch, and gain inconsistency. Researchers have recently proposed solutions to these problems for applications such as source localization and dereverberation, though there is less literature on real-time practical issues. This article reviews recent research on distributed signal processing techniques and applications. New applications benefitting from the wide coverage of distributed microphones are reviewed and their limitations are discussed. This survey does not cover partially or fully connected wireless acoustic sensor networks.

  • 50.
    Qureshi, Kamran
    Mid Sweden University, Faculty of Science, Technology and Media, Department of Electronics Design.
    Pedestrian Detection on FPGA. 2014. Independent thesis Advanced level (degree of Master (One Year)), 20 credits / 30 HE credits. Student thesis
    Abstract [en]

    Image processing emerges from the curiosity of human vision. Translating what we see in everyday life, and how we differentiate between objects, into robotic vision is a challenging and modern research topic. This thesis focuses on detecting a pedestrian within a standard-format image. The efficiency of the algorithm is observed after its implementation on an FPGA. The algorithm for pedestrian detection was developed using MATLAB as a base. To detect a pedestrian, the histogram of oriented gradients (HOG) of an image was computed. The study indicates that the HOG is distinctive for different objects within an image. The HOG of a series of images was computed to train a binary classifier. A new image was then fed to the classifier in order to test its efficiency. Within the time frame of the thesis, the algorithm was partially translated to a hardware description using VHDL as a base descriptor. The proficiency of the hardware implementation was noted and the results exported to MATLAB for further processing. A hybrid model was created, in which the pre-processing steps were computed in FPGA and the classification performed in MATLAB. The outcome of the thesis shows that HOG is a very efficient and effective way to classify and differentiate objects within an image. Given its efficiency, the algorithm may even be extended to video.

    Download full text (pdf)
    Pedestrian Detection on FPGA
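    As a rough illustration of the descriptor the abstract describes, the sketch below computes per-cell histograms of oriented gradients in NumPy (unsigned gradients, no block normalization; the cell size and bin count are assumed parameters, not the thesis's exact configuration):

    ```python
    import numpy as np

    def hog_cell_histograms(image, cell_size=8, n_bins=9):
        """Per-cell histograms of oriented gradients (unsigned, 0-180 deg)."""
        img = image.astype(float)
        # central-difference gradients (borders left at zero)
        gx = np.zeros_like(img)
        gy = np.zeros_like(img)
        gx[:, 1:-1] = img[:, 2:] - img[:, :-2]
        gy[1:-1, :] = img[2:, :] - img[:-2, :]
        mag = np.hypot(gx, gy)
        ang = np.rad2deg(np.arctan2(gy, gx)) % 180.0  # unsigned orientation
        h, w = img.shape
        cy, cx = h // cell_size, w // cell_size
        hist = np.zeros((cy, cx, n_bins))
        bin_w = 180.0 / n_bins
        for i in range(cy):
            for j in range(cx):
                m = mag[i*cell_size:(i+1)*cell_size, j*cell_size:(j+1)*cell_size]
                a = ang[i*cell_size:(i+1)*cell_size, j*cell_size:(j+1)*cell_size]
                idx = (a // bin_w).astype(int) % n_bins
                for b in range(n_bins):
                    hist[i, j, b] = m[idx == b].sum()
        return hist
    ```

    A vertical step edge, for example, puts all of its gradient energy into the 0° bin of the cells it crosses — it is this kind of orientation signature, concatenated over all cells, that the thesis's binary classifier is trained on.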