Read and write stem/multistream audio files
Python package to read and write STEM audio files. Technically, stems are audio containers that combine multiple audio streams and metadata in a single audio file. This makes them well suited to playing back multitrack audio, where users can select the audio sub-stream during playback (supported, for example, by VLC).
Under the hood, stempeg uses ffmpeg for reading and writing multistream audio. Optionally, MP4Box is used to create STEM files that are compatible with Native Instruments hardware and software.
stempeg relies on ffmpeg (>= 3.2 is suggested).
The installation of ffmpeg differs among operating systems. If you use anaconda, you can install ffmpeg on Windows/Mac/Linux using the following command:
conda install -c conda-forge ffmpeg
Note that for better quality encoding it is recommended to install ffmpeg with libfdk-aac codec support as follows:
brew install ffmpeg --with-fdk-aac
docker pull jrottenberg/ffmpeg
If you plan to write stem files with full compatibility with Native Instruments Traktor DJ hardware and software, you need to install MP4Box.
brew install gpac
apt-get install gpac
Further installation instructions for all operating systems can be found here.
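Before using stempeg, it can help to confirm that the external tools are actually reachable. The following is a minimal sketch using only the Python standard library. The helper name find_tools is hypothetical, and the executable names (ffmpeg, MP4Box) are assumed to match what your installation puts on the PATH.

```python
import shutil


def find_tools(names=("ffmpeg", "MP4Box")):
    """Map each tool name to its resolved path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


tools = find_tools()
for name, location in tools.items():
    # A missing MP4Box only matters for Native Instruments compatible output.
    print(f"{name}: {location or 'not found'}")
```

A result of None for ffmpeg means reading and writing will not work until ffmpeg is installed.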
A) Installation via PyPI using pip
pip install stempeg
B) Installation via conda
conda install -c conda-forge stempeg
Stempeg can read multi-stream and single-stream audio files, so it can replace your usual audio loader for 1d or 2d (mono/stereo) arrays.
By default, read_stems assumes that multiple substreams can exist (default reader=stempeg.StreamsReader()).
To support multi-stream audio even when the audio container doesn't support multiple streams (e.g. WAV), streams can be mapped to multiple pairs of channels. In that case, reader=stempeg.ChannelsReader() can be passed. Also see stempeg.ChannelsWriter.
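The idea of mapping streams to pairs of channels can be illustrated with plain numpy. This is only a sketch of the concept, not stempeg's implementation: the function channels_to_stems is hypothetical, and it assumes that consecutive channel pairs belong to the same stem.

```python
import numpy as np


def channels_to_stems(audio, channels_per_stem=2):
    """Split a (samples, total_channels) array into (stems, samples, channels).

    Assumes channels are laid out stem by stem, e.g. L0 R0 L1 R1 ...
    """
    samples, total_channels = audio.shape
    if total_channels % channels_per_stem != 0:
        raise ValueError("channel count is not a multiple of channels_per_stem")
    nb_stems = total_channels // channels_per_stem
    # (samples, stems, channels) -> (stems, samples, channels)
    return audio.reshape(samples, nb_stems, channels_per_stem).transpose(1, 0, 2)


# Six channels of dummy data: three stereo stems, four samples each.
six_channel_audio = np.arange(24).reshape(4, 6)
stems = channels_to_stems(six_channel_audio)
print(stems.shape)
```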
import stempeg
S, rate = stempeg.read_stems(stempeg.example_stem_path())
S is a numpy tensor that contains the time domain signals scaled to [-1..1]. The shape is (stems, samples, channels). Detailed documentation of read_stems can be viewed here. Note that a small stems excerpt from The Easton Ellises, licensed under Creative Commons CC BY-NC-SA 3.0, is included and can be accessed using stempeg.example_stem_path().
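To make the (stems, samples, channels) layout concrete, here is a small sketch that uses a synthetic array in place of a real file, so it runs without ffmpeg. The random data and the variable names are assumptions for illustration only.

```python
import numpy as np

rate = 44100
# Synthetic stand-in for the tensor returned by read_stems:
# 3 stems, 1 second of audio, stereo, values in [-1, 1].
rng = np.random.default_rng(0)
S = rng.uniform(-1.0, 1.0, size=(3, rate, 2))

first_stem = S[0]                     # one stem: (samples, channels)
mono = S.mean(axis=-1)                # average the channels: (stems, samples)
summed = S.sum(axis=0)                # sum over stems: (samples, channels)
duration_seconds = S.shape[1] / rate  # length of the signal in seconds

print(first_stem.shape, mono.shape, summed.shape, duration_seconds)
```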
Individual substreams of the stem file can be read by passing the corresponding stem id (starting from 0):
S, rate = stempeg.read_stems(stempeg.example_stem_path(), stem_id=[0, 1])
Excerpts from the stem instead of the full file can be read by providing the start (start) and duration (duration) in seconds to read_stems:
S, _ = stempeg.read_stems(stempeg.example_stem_path(), start=1, duration=1.5)
# read from second 1.0 to second 2.5
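As a quick sanity check, the start and duration values can be converted to sample positions. The helper below is a hypothetical illustration of that arithmetic. It does not describe how stempeg or ffmpeg seek internally, and the sample rate of 44100 Hz is an assumption.

```python
def excerpt_bounds(start, duration, rate):
    """Return (first_sample, end_sample) for an excerpt given in seconds."""
    first_sample = int(round(start * rate))
    end_sample = int(round((start + duration) * rate))
    return first_sample, end_sample


first_sample, end_sample = excerpt_bounds(start=1, duration=1.5, rate=44100)
print(first_sample, end_sample, end_sample - first_sample)
```

At 44100 Hz, an excerpt with start=1 and duration=1.5 spans 66150 samples, which is the length you would expect along the samples axis.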
As seen in the flow chart above, stempeg supports multiple ways to write multi-track audio.
stempeg.write_audio can be used for single-stream, multi-channel audio files.
Stempeg wraps a number of ffmpeg parameters to resample the output sample rate and adjust the audio codec, if necessary.
stempeg.write_audio(path="out.mp4", data=S, sample_rate=44100.0, output_sample_rate=48000.0, codec='aac', bitrate=256000)
Writing stem files from a numpy tensor can be done with:
stempeg.write_stems(path="output.stem.mp4", data=S, sample_rate=44100, writer=stempeg.StreamsWriter())
As seen in the flow chart above, stempeg supports multiple ways to write multi-stream audio. Each method takes a different number of parameters. To select a method, one of the following writers can be passed:
stempeg.FilesWriter: Stems will be saved into multiple files. For the naming, basename(path) is ignored and just the parent of path and its extension are used.
stempeg.ChannelsWriter: Stems will be saved as multiple channels.
stempeg.StreamsWriter (default): Stems will be saved into a single multi-stream file.
stempeg.NIStemsWriter: Stems will be saved into a single multi-stream audio file. Additionally, Native Instruments Stems compatible metadata is added. This requires the installation of MP4Box.
Warning: Muxing stems using ffmpeg leads to multi-stream files that are not compatible with Native Instruments hardware or software. Please use MP4Box if you use stempeg.NIStemsWriter().
For more information on writing stems, see stempeg.write_stems.
For an example that documents the advanced features of the writer, see readwrite.py.
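The naming rule described for stempeg.FilesWriter (only the parent directory and the extension of path are used) can be sketched with pathlib. This is an assumption-laden illustration, not the library's code: the helper stem_file_paths and the stem names are hypothetical, and the actual file names produced by stempeg may differ.

```python
from pathlib import PurePosixPath


def stem_file_paths(path, stem_names):
    """Derive one output path per stem from the parent and extension of path."""
    target = PurePosixPath(path)
    # The base name of `path` is dropped; each stem gets its own file name.
    return [target.parent / f"{name}{target.suffix}" for name in stem_names]


paths = stem_file_paths("export/ignored_name.wav", ["drums", "bass", "vocals"])
for p in paths:
    print(p)
```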
stempeg provides a convenient CLI tool to convert a stem to multiple WAV files. The -s switch sets the start, the -t switch sets the duration.
stem2wav "The Easton Ellises - Falcon 69.stem.mp4" -s 1.0 -t 2.5
When read_stems is called repeatedly, each call makes two system calls: one for getting the file info and one for the actual reading. To speed this up, you can provide the Info object to read_stems if the number of streams, the number of channels and the sample rate are identical.
file_path = stempeg.example_stem_path()
info = stempeg.Info(file_path)
S, _ = stempeg.read_stems(file_path, info=info)
For encoding, it is recommended to use the Fraunhofer AAC encoder (libfdk_aac), which is not included in the default ffmpeg builds. Note that the conda version currently does not include fdk-aac. If libfdk_aac is not installed, stempeg will use the default aac codec, which will result in slightly inferior audio quality.
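To find out which encoder your ffmpeg build offers, you can inspect the output of ffmpeg -encoders. The sketch below shows one way to do this from Python. The helper names are hypothetical, the parsing assumes the encoder name is the second whitespace-separated column of each line, and this is not how stempeg itself selects the codec.

```python
import shutil
import subprocess


def list_ffmpeg_encoders():
    """Return the text printed by `ffmpeg -encoders`, or '' if ffmpeg is missing."""
    if shutil.which("ffmpeg") is None:
        return ""
    result = subprocess.run(
        ["ffmpeg", "-hide_banner", "-encoders"],
        capture_output=True, text=True, check=False,
    )
    return result.stdout


def pick_aac_codec(encoders_text):
    """Prefer libfdk_aac when it is listed, otherwise fall back to aac."""
    names = set()
    for line in encoders_text.splitlines():
        parts = line.split()
        if len(parts) >= 2:
            names.add(parts[1])  # second column is assumed to be the encoder name
    return "libfdk_aac" if "libfdk_aac" in names else "aac"


codec = pick_aac_codec(list_ffmpeg_encoders())
print(codec)
```

The resulting name could then be passed as the codec argument of stempeg.write_audio.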