Johannes Schriewer f3dd850809 Missing library mention in Readme		2024-03-06 22:21:23 +01:00
audio	Update README to include ADTS and AAC elements	2024-03-06 22:18:21 +01:00
AudioLib	Add equalizer to test application	2024-03-01 01:31:35 +01:00
AudioLib.xcodeproj	Re-organize	2024-03-06 21:55:32 +01:00
.gitignore	Remove build dirs from git	2024-03-04 01:22:27 +01:00
Makefile	Update makefiles	2024-03-06 22:14:09 +01:00
Readme.md	Missing library mention in Readme	2024-03-06 22:21:23 +01:00
shell.nix	Add build system for linux	2024-03-04 01:23:09 +01:00

Readme.md

Audio Pipeline

This project will be a complete audio processing pipeline for microcontrollers. The idea is to be able to run various demuxers, decoders and filters directly on a microcontroller. Goals include:

Decode and process audio on ESP32
Use as little RAM as possible. Basic functionality should run on WROOM modules.
If extra RAM is available (as in the WROVER modules) it should be possible to run quite sophisticated DSP pipelines.
Only use fixed point and at most single precision float calculations as the ESP32 has no FPU for double precision and only one FPU for both CPU cores.
The pipeline sources and sinks should be flexible, e.g. allow I2S data to run through an EQ filter and then emit the result as I2S or sink it to Bluetooth.
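
As a rough illustration of what the fixed-point goal means in practice, the following sketch scales 16-bit samples by a gain stored in Q15 format (32768 would represent 1.0) using integer arithmetic only. The helper names `q15_scale` and `q15_apply_gain` are hypothetical and not part of this library.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: multiply a 16-bit sample by a Q15 gain.
 * The gain is an int16_t, so it covers roughly -1.0 .. +0.99997.
 * Note: right-shifting a negative value is implementation-defined in C,
 * but behaves as an arithmetic shift on common compilers (GCC, Clang). */
static int16_t q15_scale(int16_t sample, int16_t gain_q15) {
    int32_t scaled = ((int32_t)sample * (int32_t)gain_q15) >> 15;
    /* Only -32768 * -32768 can exceed the 16-bit range, clip it. */
    if (scaled > INT16_MAX) scaled = INT16_MAX;
    if (scaled < INT16_MIN) scaled = INT16_MIN;
    return (int16_t)scaled;
}

/* Hypothetical helper: apply the same Q15 gain to a whole buffer in place. */
static void q15_apply_gain(int16_t *samples, size_t count, int16_t gain_q15) {
    for (size_t i = 0; i < count; i++) {
        samples[i] = q15_scale(samples[i], gain_q15);
    }
}
```

A gain of 16384 halves the signal, so no floating-point unit is needed for simple volume changes.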

License will be mostly BSD if possible (some decoders may use GPL code, so if you include those you'll be bound to the GPL).

How to use

To build a pipeline you need at least two elements: a source and a sink. See Implemented elements for documentation.

If you have to decode an audio format you will probably need a demuxer to parse the file format or bitstream and a decoder to decode the compressed audio to a sample buffer.

If you have external requirements like a fixed number of channels, volume or sample rate constraints you will probably need one or more filter elements.

To build a pipeline you will need at least audio.h and the header files for the elements you want to add to the pipeline. For a code example look at AudioLib/main.c.

A small taste: this plays the file test.mp3 on the default audio device (you will need libao for the sink):

#include "audio.h"
#include "audio_source_file.h"
#include "audio_demuxer_mp3.h"
#include "audio_decoder_mp3.h"
#include "audio_sink_libao.h"

int main(int argc, char **argv) {
    AudioPipeline *pipeline = audio_pipeline_assemble(
        audio_source_file("test.mp3", 512),
        audio_demuxer_mp3(),
        audio_decoder_mp3(),
        audio_sink_libao(),
        NULL
    );

    if (pipeline != NULL) {
        AudioPipelineStatus result = pipeline->start(pipeline);
        audio_pipeline_destroy(pipeline);
        return result == PipelineFinished ? 0 : 1;
    }
    return 1;
}

Compiling

If you're on macOS, just use the Xcode project file to build all dependencies. You will need to have libao installed via Homebrew.

If you're on Unix/Linux you can use the provided Makefiles. This will create one static library per dependency and one for the audio library in the subdirectory audio/.build and its subdirectories.

So currently you will get:

libaudio-test: the test binary created from AudioLib/main.c
audio/.build/libaudio.a: static library for the audio library
audio/.build/libmad/libmad.a: MP3 decoder library (dependency, GPL license)
audio/.build/speexdsp/libspeexresampler.a: Speex resampler library (dependency, BSD license)
audio/.build/libfaad2/libfaad2.a: FAAD2 (HE-)AAC decoder (dependency, GPL license)

Implemented elements

Only a few elements are implemented so far:

Source: Test tone

Include: audio_source_testtone.h
Create: audio_source_testtone(uint16_t sample_rate, uint8_t channels, uint8_t bits_per_sample);
What it does: Play a sequence of pings from 440 Hz to 1440 Hz and then stop

Source: File

Include: audio_source_file.h
Create: audio_source_file(char *filename, uint32_t block_size);
What it does: Loads a file from disk and emits chunks of that file to the next element (usually a demuxer)

Demuxer: MP3

Include: audio_demuxer_mp3.h
Create: audio_demuxer_mp3();
What it does: Finds the sync marker in the input data stream and assembles complete MP3 frames to be emitted to the decoder
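
To give an idea of what finding the sync marker involves, here is a minimal sketch that scans a buffer for the 11-bit MPEG audio frame sync (a 0xFF byte followed by a byte whose top three bits are set). The function name `mp3_find_sync` is hypothetical. The actual demuxer will need to do more than this, for example validating the remaining header fields and computing the frame length, since the bit pattern can also occur inside audio data.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: return the offset of the first candidate MPEG audio
 * frame sync in data, or -1 if none is found. A candidate is the byte 0xFF
 * followed by a byte with its three most significant bits set. */
static long mp3_find_sync(const uint8_t *data, size_t len) {
    for (size_t i = 0; i + 1 < len; i++) {
        if (data[i] == 0xFF && (data[i + 1] & 0xE0) == 0xE0) {
            return (long)i;
        }
    }
    return -1;
}
```

Because the source delivers fixed-size chunks, a sync marker may be split across two chunks, so a demuxer built on this idea would have to keep the trailing byte of each chunk around.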

Decoder: MP3

Include: audio_decoder_mp3.h
Create: audio_decoder_mp3();
What it does: Takes MP3 frames from the demuxer and decodes them to 16-bit audio samples
Dependencies: modified libmad in deps/mp3
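
Upstream libmad produces fixed-point samples with 28 fractional bits, so getting to 16-bit output requires rounding, clipping and shifting. The sketch below follows the approach of the `scale` function in libmad's minimad example. It is an assumption that the modified libmad in this repository works the same way, and the names `Fixed28`, `FIXED28_FRACBITS` and `fixed28_to_int16` are hypothetical stand-ins so the block compiles without libmad.

```c
#include <stdint.h>

/* Hypothetical stand-ins for libmad's mad_fixed_t and MAD_F_FRACBITS. */
typedef int32_t Fixed28;
#define FIXED28_FRACBITS 28
#define FIXED28_ONE ((Fixed28)1 << FIXED28_FRACBITS)

/* Convert a 28-bit-fraction fixed-point sample (nominal range -1.0 .. 1.0)
 * to a signed 16-bit sample: round, clip, then drop the extra bits.
 * Relies on arithmetic right shift for negative values, which is
 * implementation-defined in C but standard on GCC and Clang. */
static int16_t fixed28_to_int16(Fixed28 sample) {
    /* Round to nearest by adding half of the final least significant bit. */
    sample += (Fixed28)1 << (FIXED28_FRACBITS - 16);
    /* Clip to the representable range. */
    if (sample >= FIXED28_ONE) {
        sample = FIXED28_ONE - 1;
    } else if (sample < -FIXED28_ONE) {
        sample = -FIXED28_ONE;
    }
    /* Keep the sign bit plus the top 15 fractional bits. */
    return (int16_t)(sample >> (FIXED28_FRACBITS + 1 - 16));
}
```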

Demuxer: ADTS

Include: audio_demuxer_adts.h
Create: audio_demuxer_adts();
What it does: Finds the sync marker in the input data stream and assembles complete AAC frames to be emitted to the decoder
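
For orientation, the following sketch parses the fixed part of an ADTS header: a 12-bit sync word, the sampling frequency index, the channel configuration and the 13-bit frame length (which includes the header itself). The struct `AdtsHeader` and the function `adts_parse_header` are hypothetical names used for illustration only and do not reflect how this library's demuxer is written.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container for the ADTS fields a demuxer typically needs. */
typedef struct {
    uint8_t  sample_rate_index; /* index into the MPEG-4 sample rate table */
    uint8_t  channel_config;    /* MPEG-4 channel configuration */
    uint8_t  header_size;       /* 7 bytes without CRC, 9 bytes with CRC */
    uint16_t frame_length;      /* header plus payload, in bytes */
} AdtsHeader;

/* Parse an ADTS header at the start of data.
 * Returns 0 on success, -1 if there is not enough data or no valid sync. */
static int adts_parse_header(const uint8_t *data, size_t len, AdtsHeader *out) {
    if (len < 7) return -1;
    /* 12-bit sync word: 0xFFF */
    if (data[0] != 0xFF || (data[1] & 0xF0) != 0xF0) return -1;
    /* The two layer bits are always 0 for ADTS */
    if ((data[1] & 0x06) != 0) return -1;

    /* protection_absent == 1 means there is no CRC after the header */
    out->header_size = (data[1] & 0x01) ? 7 : 9;
    out->sample_rate_index = (uint8_t)((data[2] >> 2) & 0x0F);
    out->channel_config = (uint8_t)(((data[2] & 0x01) << 2) | (data[3] >> 6));
    out->frame_length = (uint16_t)(((data[3] & 0x03) << 11)
                                   | (data[4] << 3)
                                   | (data[5] >> 5));
    return 0;
}
```

With the frame length known, a demuxer can collect exactly that many bytes before handing the frame to the decoder.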

Decoder: AAC

Include: audio_decoder_aac.h
Create: audio_decoder_aac();
What it does: Takes AAC frames from the demuxer and decodes them to 16-bit audio samples
Dependencies: modified libfaad2 in deps/aac

You can configure the AAC decoder and disable for example HE-AAC (called SBR there) by editing config.h in deps/aac.

Filter: Resample

Include: audio_filter_resample.h
Create: audio_filter_resample(uint32_t output_sample_rate);
What it does: Resamples sample data from previous elements to match output_sample_rate
Dependencies: speexdsp resampler implementation in deps/resampler

You can configure the quality of the resampler by editing the define in the header file. The lowest setting is 0 (not recommended, it sounds bad). For microcontrollers the default should be around 1; on more powerful devices you may go up to 4 (which is what Opus uses internally).

Values above 4 increase the quality in theory, but the difference is not really audible.
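
To illustrate the basic idea of sample rate conversion, here is a deliberately simple linear-interpolation resampler for one channel of 16-bit samples, using a 16.16 fixed-point read position. This is not the algorithm used by the speexdsp resampler, which is considerably more sophisticated. The function name `resample_linear` is hypothetical, and the sketch does no low-pass filtering, so downsampling with it will alias.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: resample mono 16-bit audio by linear interpolation.
 * Writes at most out_cap samples to out and returns the number written.
 * Stops when the interpolation would need a sample beyond the input. */
static size_t resample_linear(const int16_t *in, size_t in_len,
                              uint32_t in_rate, uint32_t out_rate,
                              int16_t *out, size_t out_cap) {
    if (in_len == 0 || in_rate == 0 || out_rate == 0) return 0;

    /* Input samples advanced per output sample, in 16.16 fixed point. */
    uint64_t step = ((uint64_t)in_rate << 16) / out_rate;
    uint64_t pos = 0;
    size_t written = 0;

    while (written < out_cap) {
        size_t idx = (size_t)(pos >> 16);
        if (idx + 1 >= in_len) break;          /* need in[idx] and in[idx + 1] */
        int64_t frac = (int64_t)(pos & 0xFFFF); /* fractional part, 0..65535 */
        int32_t a = in[idx];
        int32_t b = in[idx + 1];
        /* a + (b - a) * frac, with frac scaled back down by 2^16 */
        out[written++] = (int16_t)(a + (int32_t)(((int64_t)(b - a) * frac) >> 16));
        pos += step;
    }
    return written;
}
```

Upsampling from 24000 Hz to 48000 Hz with this sketch inserts the midpoint between each pair of neighbouring input samples.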

Filter: Parametric EQ

Include: audio_filter_param_eq.h
Create: audio_filter_param_eq(EQBand *bands, int numBands, float prescale);
What it does: Applies a list of filters (bands) to the sample data

You can select which implementation should run in this filter by editing the header file:

Define SECOND_ORDER_FILTER to create a filter with two chained biquad stages (faster rolloff, but needs twice the processing time)
Define FIXEDPOINT if you do not have an FPU, but be aware that the filter is unstable with bands below 300 Hz in this case (a low shelf filter is fine though)!
Define LOW_PRECISION to use float processing instead of double. Mutually exclusive with fixed point.

Example:

#include "audio.h"
#include "audio_source_testtone.h"
#include "audio_filter_param_eq.h"
#include "audio_sink_libao.h"

int main(void) {
    AudioPipeline *pipeline;
    EQBand bands[] = { /* Bath tub curve */
        {
            .type = EQTypeLowShelf,
            .frequency = 150.0,
            .Q = 1.41,
            .gain = 1.5,
        },
        {
            .type = EQTypePeak,
            .frequency = 4000.0,
            .Q = 1.41,
            .gain = 0.0
        },
        {
            .type = EQTypePeak,
            .frequency = 10000.0,
            .Q = 2.82,
            .gain = 1.0
        }
    };

    pipeline = audio_pipeline_assemble(
        audio_source_testtone(48000, 2, 16),
        audio_filter_param_eq(bands, 3, -2.5),
        audio_sink_libao(),
        NULL
    );

    ...
}
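
The SECOND_ORDER_FILTER and LOW_PRECISION options mentioned above can be pictured with a small sketch of a single-precision biquad in direct form I, plus a helper that runs a sample through several stages in series. The `Biquad` struct and both function names are hypothetical and not taken from audio_filter_param_eq.h. Coefficient calculation for the individual band types is left out, and the sketch assumes coefficients that are already normalized so that a0 is 1.

```c
/* Hypothetical biquad stage: normalized coefficients plus filter state. */
typedef struct {
    float b0, b1, b2;   /* feed-forward coefficients */
    float a1, a2;       /* feedback coefficients (a0 is assumed to be 1) */
    float x1, x2;       /* previous two inputs */
    float y1, y2;       /* previous two outputs */
} Biquad;

/* Process one sample through a single stage (direct form I). */
static float biquad_process(Biquad *f, float x) {
    float y = f->b0 * x + f->b1 * f->x1 + f->b2 * f->x2
            - f->a1 * f->y1 - f->a2 * f->y2;
    f->x2 = f->x1;
    f->x1 = x;
    f->y2 = f->y1;
    f->y1 = y;
    return y;
}

/* Run one sample through num_stages stages in series. Chaining two
 * identical stages steepens the rolloff at twice the cost per sample. */
static float biquad_chain_process(Biquad *stages, int num_stages, float x) {
    for (int i = 0; i < num_stages; i++) {
        x = biquad_process(&stages[i], x);
    }
    return x;
}
```

Each stage costs five multiplications per sample, which is why chaining two of them roughly doubles the processing time.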

Sink: File

Include: audio_sink_file.h
Create: audio_sink_file(char *filename);
What it does: Writes whatever data it gets to a file, primarily used for debugging.

Sink: libao

Include: audio_sink_libao.h
Create: audio_sink_libao();
What it does: Opens the default audio device of your computer and dumps the input samples to that device.
Dependencies: locally installed libao (on a Mac, Homebrew is fine; Linux distributions should have it in their package managers)

TODO

A list of filters, demuxers and decoders that are planned to be implemented in the future:

Demuxer: ISO MPEG-4 containers (mp4/mov)
Demuxer: OGG (for Vorbis and Opus)
Decoder: FLAC
Decoder: Opus
Decoder: Vorbis
Decoder: WAV (PCM)
Decoder: ALAC (Apple Lossless, used by AirPlay)
Decoder: QOA (Quite OK Audio)
Source: I2S
Source: HTTP/S
Source: Icecast
Source: Bluetooth A2DP
Source: RTP
Source: VBAN
Sink: I2S
Sink: Bluetooth A2DP
Sink: SD-Card
Sink: RTP
Sink: VBAN
Encoder: WAV (PCM)
Encoder: QOA (Quite OK Audio)
Filter: Channel Mixer
Filter: Volume
Filter: Bankstown (Pseudo Bass enhancement for small speakers)