A Stem-Agnostic Single-Decoder System for Music Source Separation Beyond Four Stems

Karn N. Watcharasupat; Alexander Lerch

doi:10.5281/zenodo.14877513

Published November 10, 2024 | Version v1

Conference paper Open

Despite significant recent progress across multiple sub-tasks of audio source separation, few music source separation systems support separation beyond the four-stem vocals, drums, bass, and other (VDBO) setup. Of the very few current systems that support source separation beyond this setup, most continue to rely on an inflexible decoder setup that can only support a fixed pre-defined set of stems. Increasing stem support in these inflexible systems correspondingly requires increasing computational complexity, rendering extensions of these systems computationally infeasible for long-tail instruments. We propose Banquet, a system that allows source separation of multiple stems using just one decoder. A bandsplit source separation model is extended to work in a query-based setup in tandem with a music instrument recognition PaSST model. On the MoisesDB dataset, Banquet — at only 24.9 M trainable parameters — performed on par with or better than the significantly more complex 6-stem Hybrid Transformer Demucs. The query-based setup allows for the separation of narrow instrument classes such as clean acoustic guitars, and can be successfully applied to the extraction of less common stems such as reeds and organs.
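As a rough illustration of the query-based, single-decoder idea described in the abstract, the toy sketch below conditions one shared mask decoder on a query vector, so that requesting a different stem changes only the query and not the parameter count. It is not the Banquet implementation: the FiLM-style scale-and-shift conditioning, the layer sizes, the class name `QueryConditionedSeparator`, and the one-hot query vectors (standing in for the PaSST-derived queries) are all assumptions made for illustration.

```python
import math
import random

N_BANDS, EMB_DIM, QUERY_DIM = 4, 6, 5  # hypothetical toy sizes


def rand_matrix(rows, cols, rng):
    """Small random weight matrix stored as a list of rows."""
    return [[rng.gauss(0.0, 0.3) for _ in range(cols)] for _ in range(rows)]


def vec_mat(vec, mat):
    """Row vector times matrix."""
    return [sum(v * row[j] for v, row in zip(vec, mat))
            for j in range(len(mat[0]))]


class QueryConditionedSeparator:
    """Toy separator with one decoder shared by every queried stem."""

    def __init__(self, seed=0):
        rng = random.Random(seed)
        self.w_enc = rand_matrix(N_BANDS, EMB_DIM, rng)      # band encoder
        self.w_gamma = rand_matrix(QUERY_DIM, EMB_DIM, rng)  # query -> scale
        self.w_beta = rand_matrix(QUERY_DIM, EMB_DIM, rng)   # query -> shift
        self.w_dec = rand_matrix(EMB_DIM, N_BANDS, rng)      # shared decoder

    def separate_frame(self, frame, query):
        """Estimate the queried stem for one frame of band magnitudes."""
        z = [math.tanh(x) for x in vec_mat(frame, self.w_enc)]
        # Assumed conditioning: scale and shift the embedding by the query.
        gamma = [1.0 + g for g in vec_mat(query, self.w_gamma)]
        beta = vec_mat(query, self.w_beta)
        z = [g * zi + b for g, zi, b in zip(gamma, z, beta)]
        # The same decoder produces a (0, 1) mask whatever the query is.
        mask = [1.0 / (1.0 + math.exp(-x)) for x in vec_mat(z, self.w_dec)]
        return [m * x for m, x in zip(mask, frame)]

    def n_params(self):
        """Total weight count; independent of how many stems are queried."""
        mats = (self.w_enc, self.w_gamma, self.w_beta, self.w_dec)
        return sum(len(m) * len(m[0]) for m in mats)


model = QueryConditionedSeparator()
frame = [1.0, 0.5, 0.25, 0.8]          # toy mixture magnitudes, one frame
q_guitar = [1.0, 0.0, 0.0, 0.0, 0.0]   # placeholder query embeddings
q_organ = [0.0, 1.0, 0.0, 0.0, 0.0]
est_guitar = model.separate_frame(frame, q_guitar)
est_organ = model.separate_frame(frame, q_organ)
```

In this sketch, supporting another stem means supplying another query vector, whereas a fixed-decoder design would add a decoder (and its weights) per stem.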

Files

000118.pdf (447.2 kB, md5:dc1bb4f8401ddef9a7f9187a54118d60)

Resource type: Conference paper
Publisher: ISMIR
Imprint: Proceedings of the 25th International Society for Music Information Retrieval Conference, 1051-1059. San Francisco, California, USA and Online.
Conference: International Society for Music Information Retrieval Conference (ISMIR 2024), San Francisco, California, USA and Online, November 10-14, 2024

License: Creative Commons Attribution 4.0 International

The Creative Commons Attribution license allows re-distribution and re-use of a licensed work on the condition that the creator is appropriately credited.

Technical metadata

Created: February 16, 2025
Modified: February 16, 2025
