Mid Sweden University

Segmentation-based Initialization for Steered Mixture of Experts
Mid Sweden University, Faculty of Science, Technology and Media, Department of Computer and Electrical Engineering (2023-). Technical University of Berlin, Germany. (Realistic3D)
Mid Sweden University, Faculty of Science, Technology and Media, Department of Computer and Electrical Engineering (2023-). (Realistic3D)
2023 (English). In: 2023 IEEE International Conference on Visual Communications and Image Processing (VCIP), IEEE conference proceedings, 2023. Conference paper, Published paper (Refereed)
Abstract [en]

The Steered-Mixture-of-Experts (SMoE) model is an edge-aware kernel representation that has been successfully explored for the compression of images, video, and higher-dimensional data such as light fields. The present work aims to leverage the potential for enhanced compression gains through efficient kernel reduction. We propose a fast segmentation-based strategy that identifies a sufficient number of kernels for representing an image and provides an initial kernel parametrization. The strategy reduces both the memory footprint and the computational complexity of the subsequent parameter optimization, resulting in an overall faster processing time. Fewer kernels, combined with the inherent sparsity of SMoE, further enhance the overall compression performance. Empirical evaluations demonstrate a gain of 0.3-1.0 dB in PSNR for a constant number of kernels, and 23 % fewer kernels and 25 % less time for constant PSNR. The results highlight the feasibility and practicality of the approach, positioning it as a valuable solution for various image-related applications, including image compression.
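
The idea of deriving initial kernels from a segmentation can be illustrated with a minimal sketch. It is not the paper's implementation: the helper names (`init_kernels_from_segments`, `smoe_reconstruct`, `psnr`), the use of one kernel per segment, constant experts, and unnormalised Gaussian gating without mixing weights are all assumptions made for illustration, and the label map is assumed to come from any segmentation method.

```python
import numpy as np

def init_kernels_from_segments(image, labels, min_var=1e-2):
    """Hypothetical helper: one kernel per segment.
    Centre = segment centroid, covariance = coordinate scatter,
    expert = mean intensity of the segment."""
    ys, xs = np.mgrid[0:image.shape[0], 0:image.shape[1]]
    coords = np.stack([ys.ravel(), xs.ravel()], axis=1).astype(float)
    values = image.ravel().astype(float)
    flat = labels.ravel()
    centres, covs, experts = [], [], []
    for seg in np.unique(flat):
        mask = flat == seg
        pts = coords[mask]
        mu = pts.mean(axis=0)
        diff = pts - mu
        # Regularise so very thin or single-pixel segments stay invertible.
        cov = diff.T @ diff / len(pts) + min_var * np.eye(2)
        centres.append(mu)
        covs.append(cov)
        experts.append(values[mask].mean())
    return np.array(centres), np.array(covs), np.array(experts)

def smoe_reconstruct(shape, centres, covs, experts):
    """Assumed gating: softmax over negative half Mahalanobis distances,
    blended with constant experts."""
    ys, xs = np.mgrid[0:shape[0], 0:shape[1]]
    coords = np.stack([ys.ravel(), xs.ravel()], axis=1).astype(float)
    diff = coords[:, None, :] - centres[None, :, :]          # (P, K, 2)
    prec = np.linalg.inv(covs)                                # (K, 2, 2)
    maha = np.einsum("pki,kij,pkj->pk", diff, prec, diff)     # (P, K)
    logits = -0.5 * maha
    logits -= logits.max(axis=1, keepdims=True)               # numerical stability
    gates = np.exp(logits)
    gates /= gates.sum(axis=1, keepdims=True)
    return (gates @ experts).reshape(shape)

def psnr(ref, est, peak=1.0):
    """Peak signal-to-noise ratio in dB for signals in [0, peak]."""
    mse = np.mean((ref - est) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)

# Toy usage: a two-region image whose segmentation equals its content.
image = np.zeros((8, 8))
image[:, 4:] = 1.0
labels = (image > 0.5).astype(int)
centres, covs, experts = init_kernels_from_segments(image, labels)
recon = smoe_reconstruct(image.shape, centres, covs, experts)
print(f"kernels: {len(experts)}, PSNR before optimisation: {psnr(image, recon):.1f} dB")
```

In this toy setting the kernel count equals the number of segments, which is one way to read "a sufficient number of kernels"; the parameters would then be refined by the subsequent optimization that the abstract mentions.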

Place, publisher, year, edition, pages
IEEE conference proceedings, 2023.
Keywords [en]
compression, gating network, segmentation, Computer vision, Image segmentation, Compression of images, Edge aware, High dimensional data, Kernel representation, Light fields, Mixture of experts, Mixture-of-experts model, Image compression
National Category
Computer Engineering
Identifiers
URN: urn:nbn:se:miun:diva-50594
DOI: 10.1109/VCIP59821.2023.10402643
Scopus ID: 2-s2.0-85184853593
ISBN: 9798350359855 (print)
OAI: oai:DiVA.org:miun-50594
DiVA, id: diva2:1839248
Conference
2023 IEEE International Conference on Visual Communications and Image Processing, VCIP 2023
Available from: 2024-02-20 Created: 2024-02-20 Last updated: 2026-04-02 Bibliographically approved
In thesis
1. Steered Mixture-of-Experts for Compact and Edge-aware Representation: From 2D Image Regression to 3D Radiance Fields
2026 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

As visual computing advances across domains such as image editing, autonomous driving, and digital twins, the need for high-fidelity yet computationally efficient representations has become increasingly critical. Traditional 2D models are constrained by fixed grids, limiting their adaptability and compactness, while emerging 3D techniques often deliver realism at the cost of excessive training time, memory usage, and energy consumption. This thesis tackles a central challenge across both 2D and 3D domains: how to construct scalable, high-quality visual representations without succumbing to inefficiency.

We examine Steered Mixture-of-Experts (SMoE)—a modular, kernel-based architecture that promises localized modeling and interpretability. Yet despite its expressive power, SMoE has historically suffered from impractical training regimes, bloated parameter counts, and poor support for high-dimensional data. This work pursues a cohesive answer to three research questions, aimed at making SMoE fast, compact, and capable of handling 3D visual content.

First, we confront the long-ignored problem of initialization. Through a segmentation-based method that aligns expert kernels with semantic image regions, we drastically reduce redundancy and training duration, producing models that are both compact and structurally aligned with the data. Second, we tackle the inefficiency of gradient-based optimization by introducing a rasterized training scheme, adapted from Gaussian splatting techniques in 3D rendering. By partitioning images into blocks and activating only relevant kernels during each optimization step, we reduce the computational footprint by an order of magnitude without sacrificing accuracy. Third, we generalize SMoE to 3D by reparameterizing its spatial kernels and integrating splatting-based differentiable rendering. This extension maintains the compactness of SMoE while supporting high-quality scene reconstruction, even under sparse supervision.
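
The block-wise activation of kernels described above can be sketched as follows. This is only an illustration under stated assumptions, not the thesis implementation: the function name `kernels_per_block`, the block size, and the selection rule (a kernel is active in a block if the axis-aligned bounding box of its `n_sigma` ellipse overlaps the block) are hypothetical choices loosely modelled on tile binning in Gaussian splatting.

```python
import numpy as np

def kernels_per_block(centres, covs, shape, block=16, n_sigma=3.0):
    """Hypothetical helper: map each image block to the indices of the
    kernels whose n_sigma extent overlaps that block."""
    height, width = shape
    # Axis-aligned half-extent of each kernel's n_sigma ellipse.
    radius = n_sigma * np.sqrt(np.stack([covs[:, 0, 0], covs[:, 1, 1]], axis=1))
    lo, hi = centres - radius, centres + radius
    active = {}
    for by in range(0, height, block):
        for bx in range(0, width, block):
            b_lo = np.array([by, bx], dtype=float)
            # Last pixel row/column covered by this block.
            b_hi = np.array([min(by + block, height),
                             min(bx + block, width)], dtype=float) - 1.0
            overlap = np.all((lo <= b_hi) & (hi >= b_lo), axis=1)
            active[(by // block, bx // block)] = np.flatnonzero(overlap)
    return active

# Toy usage: two small kernels in opposite corners of a 32x32 image.
centres = np.array([[4.0, 4.0], [28.0, 28.0]])
covs = np.stack([np.eye(2), np.eye(2)])
active = kernels_per_block(centres, covs, (32, 32), block=16)
for block_index, kernel_ids in sorted(active.items()):
    print(block_index, kernel_ids.tolist())
```

With such a mapping, an optimization step on one block would only need to evaluate the kernels listed for that block, which is where the reduction in computation would come from.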

Experimental results confirm that our methods outperform baseline SMoE implementations in both speed and reconstruction quality across 2D and 3D tasks, and they further surpass existing 3DGS and related Gaussian-based approaches. Moreover, our approach enables previously infeasible applications—real-time training, compact deployment, and scalable modeling of complex scenes.

This thesis transforms SMoE from a theoretically elegant yet impractical construct into a viable backbone for efficient, high-fidelity visual data representation. By grounding mixture models in perceptual structure and exploiting block-level sparsity, we chart a broader design principle for structure-aware, rasterization-friendly learning systems.

Place, publisher, year, edition, pages
Berlin: Technische Universität Berlin, 2026. p. 121
Series
Mid Sweden University doctoral thesis, ISSN 1652-893X ; 443
National Category
Electrical Engineering, Electronic Engineering, Information Engineering
Identifiers
urn:nbn:se:miun:diva-56611 (URN)
Public defence
2026-02-05, Berlin, 00:00
Supervisors
Note

The thesis is part of a double PhD degree with Technische Universität Berlin and Mid Sweden University, published at TU Berlin.

At the time of the doctoral defence the following papers were unpublished: papers 3 and 5 (manuscripts).

Available from: 2026-02-13 Created: 2026-02-12 Last updated: 2026-02-13 Bibliographically approved

Open Access in DiVA

fulltext (3386 kB), 23 downloads
File information
File name: FULLTEXT01.pdf
File size: 3386 kB
Checksum (SHA-512): 4ab967126f50a2302e1583014b0437418f575a147df425742129fafd27508a1b76521e86e389a717ce57ba64ae4f5b203b855a711ba34d4d66c1f0d2ece0cb7c
Type: fulltext
Mimetype: application/pdf


Authority records

Li, Yi-Hsin; Sjöström, Mårten
