
Audio Pipeline

This project will be a complete audio processing pipeline for microcontrollers. The idea is to be able to run various demuxers, decoders and filters directly on a microcontroller. Goals include:

  • Decode and process audio on ESP32
  • Use as little RAM as possible. Basic functionality should run on WROOM modules.
  • If extra RAM is available (as in the WROVER modules) it should be possible to run quite sophisticated DSP pipelines.
  • Only use fixed point and at most single precision float calculations as the ESP32 has no FPU for double precision and only one FPU for both CPU cores.
  • The pipeline sources and sinks should be flexible, e.g. allow I2S data to run through an EQ filter and then emit the result as I2S or sink it to Bluetooth.
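
To illustrate the fixed point goal above, here is a minimal sketch of scaling a 16-bit sample by a gain in Q15 format using integer math only. The helper name apply_gain_q15 is hypothetical and not part of this library; it only shows the kind of arithmetic that avoids the FPU entirely.

```c
#include <stdint.h>

/* Hypothetical helper, not part of this library: scale one 16-bit sample
 * by a gain in Q15 format (32768 would be 1.0, so the largest positive
 * gain, 32767, is just below 1.0). */
int16_t apply_gain_q15(int16_t sample, int16_t gain_q15) {
    /* a 16 x 16 bit product always fits into 32 bits */
    int32_t product = (int32_t)sample * (int32_t)gain_q15;

    /* add half an LSB for rounding, then drop the 15 fractional bits
     * (assumes arithmetic right shift for negative values, which is what
     * GCC and Clang do) */
    product = (product + (1 << 14)) >> 15;

    /* saturate instead of wrapping around */
    if (product > INT16_MAX) product = INT16_MAX;
    if (product < INT16_MIN) product = INT16_MIN;
    return (int16_t)product;
}
```

A gain of 16384 is 0.5 in Q15, so apply_gain_q15(1000, 16384) halves the sample.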

License will be mostly BSD if possible (some decoders may use GPL code, so if you include those you'll be bound to the GPL).

How to use

To build a pipeline you need at least two elements: a source and a sink. See Implemented elements for documentation.

If you have to decode an audio format you will probably need a demuxer to parse the file format or bitstream and a decoder to decode the compressed audio to a sample buffer.

If you have external requirements like a fixed number of channels, volume or sample rate constraints you will probably need one or more filter elements.

To build a pipeline you will need at least audio.h and the header files for the elements you want to add to the pipeline. For a code example look at AudioLib/main.c.

A small taste (this plays the file test.mp3 on the default audio device; you will need libao for the sink):

#include "audio.h"
#include "audio_source_file.h"
#include "audio_demuxer_mp3.h"
#include "audio_decoder_mp3.h"
#include "audio_sink_libao.h"

int main(int argc, char **argv) {
    AudioPipeline *pipeline = audio_pipeline_assemble(
        audio_source_file("test.mp3", 512),
        audio_demuxer_mp3(),
        audio_decoder_mp3(),
        audio_sink_libao(),
        NULL
    );

    if (pipeline != NULL) {
        AudioPipelineStatus result = pipeline->start(pipeline);
        audio_pipeline_destroy(pipeline);
        return result == PipelineFinished ? 0 : 1;
    }
    return 1;
}

Compiling

If you're on macOS, just use the Xcode project file to build all dependencies. You will need to have libao installed via Homebrew.

If you're on Unix/Linux, you can use the provided Makefiles. They create one static library per dependency and one for the audio library in the sub-directory audio/.build and its subdirectories.

So currently you will get:

  • libaudio-test: the test binary created from AudioLib/main.c
  • audio/.build/libaudio.a: static library for the audio library
  • audio/.build/libmad/libmad.a: MP3 decoder library (dependency, GPL license)
  • audio/.build/speexdsp/libspeexresampler.a: Speex resampler library (dependency, MIT license)
  • audio/.build/libfaad2/libfaad2.a: FAAD2 (HE-)AAC decoder (dependency, GPL license)

Implemented elements

Not many elements are implemented yet. These are available:

Source: Test tone

  • Include: audio_source_testtone.h
  • Create: audio_source_testtone(uint16_t sample_rate, uint8_t channels, uint8_t bits_per_sample);
  • What it does: Play a sequence of pings from 440 Hz to 1440 Hz and then stop

Source: File

  • Include: audio_source_file.h
  • Create: audio_source_file(char *filename, uint32_t block_size);
  • What it does: Loads a file from disk and emits chunks of that file to the next element (usually a demuxer)

Demuxer: MP3

  • Include: audio_demuxer_mp3.h
  • Create: audio_demuxer_mp3();
  • What it does: Finds the sync marker in the input data stream and assembles complete MP3 frames to be emitted to the decoder
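
To give an idea of what finding the sync marker means: an MPEG audio frame header starts with 11 set bits. The following sketch scans a buffer for that pattern. It is not this library's demuxer code, and the function name find_mp3_sync is made up for illustration. A real demuxer additionally has to validate the remaining header fields, because the same bit pattern can occur inside audio data.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: return the offset of the first candidate MPEG audio
 * frame header in buf, or -1 if there is none. */
long find_mp3_sync(const uint8_t *buf, size_t len) {
    for (size_t i = 0; i + 1 < len; i++) {
        /* sync word: 0xFF followed by a byte whose top three bits are set */
        if (buf[i] == 0xFF && (buf[i + 1] & 0xE0) == 0xE0) {
            return (long)i;
        }
    }
    return -1;
}
```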

Decoder: MP3

  • Include: audio_decoder_mp3.h
  • Create: audio_decoder_mp3();
  • What it does: Takes MP3 frames from the demuxer and decodes them to 16-bit audio samples
  • Dependencies: modified libmad in deps/mp3
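
libmad delivers decoded samples as 32-bit fixed point values with 28 fractional bits, so they have to be rounded and clipped to get 16-bit output. The sketch below is modeled on the scale() helper from libmad's minimad.c example. Whether the decoder element here does exactly the same is an assumption, and the names FRACBITS, FIXED_ONE and fixed_to_s16 are local to this example so it compiles without libmad.

```c
#include <stdint.h>

#define FRACBITS  28                          /* same value as libmad's MAD_F_FRACBITS */
#define FIXED_ONE ((int64_t)1 << FRACBITS)    /* 1.0 in this format */

/* Hypothetical helper: convert one fixed point sample to signed 16 bit */
int16_t fixed_to_s16(int32_t sample) {
    /* round: add half of the smallest step that survives the shift
     * (done in 64 bit so the addition cannot overflow) */
    int64_t v = (int64_t)sample + ((int64_t)1 << (FRACBITS - 16));

    /* clip to the range [-1.0, 1.0) */
    if (v >= FIXED_ONE) v = FIXED_ONE - 1;
    else if (v < -FIXED_ONE) v = -FIXED_ONE;

    /* quantize: keep sign bit plus 15 bits (assumes arithmetic right
     * shift for negative values, as on GCC and Clang) */
    return (int16_t)(v >> (FRACBITS + 1 - 16));
}
```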

Demuxer: ADTS

  • Include: audio_demuxer_adts.h
  • Create: audio_demuxer_adts();
  • What it does: Finds the sync marker in the input data stream and assembles complete AAC frames to be emitted to the decoder
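
For ADTS, assembling a complete frame boils down to checking the 12-bit sync word and reading the 13-bit frame length from the header. A minimal sketch of that step follows. It is not taken from this library, and adts_frame_length is a hypothetical name.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: return the total frame length in bytes (header
 * included) if hdr starts with a plausible ADTS header, or 0 otherwise. */
size_t adts_frame_length(const uint8_t *hdr, size_t len) {
    if (len < 7) {
        return 0; /* an ADTS header without CRC is 7 bytes long */
    }
    /* 12 sync bits set and the two layer bits zero */
    if (hdr[0] != 0xFF || (hdr[1] & 0xF6) != 0xF0) {
        return 0;
    }
    /* frame length: low 2 bits of byte 3, all of byte 4, top 3 bits of byte 5 */
    size_t frame_len = ((size_t)(hdr[3] & 0x03) << 11)
                     | ((size_t)hdr[4] << 3)
                     | ((size_t)hdr[5] >> 5);
    return frame_len >= 7 ? frame_len : 0;
}
```

A demuxer built this way would buffer input until that many bytes are available and then hand the frame on to the decoder.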

Decoder: AAC

  • Include: audio_decoder_aac.h
  • Create: audio_decoder_aac();
  • What it does: Takes AAC frames from the demuxer and decodes them to 16-bit audio samples
  • Dependencies: modified libfaad2 in deps/aac

You can configure the AAC decoder by editing config.h in deps/aac, for example to disable HE-AAC (called SBR there).

Filter: Resample

  • Include: audio_filter_resample.h
  • Create: audio_filter_resample(uint32_t output_sample_rate);
  • What it does: Resamples sample data from previous elements to match output_sample_rate
  • Dependencies: speexdsp resampler implementation in deps/resampler

You can configure the quality of the resampler by editing the define in the header file. The lowest setting is 0 (not recommended, it sounds bad). The default for microcontrollers should be around 1; on more powerful devices you can go up to 4 (which is what Opus uses internally).

Values above 4 increase the quality in theory, but the difference is not really audible.
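
Resampling changes the number of frames per block, so whatever consumes the output needs a buffer sized for the new rate. The following sketch shows that arithmetic. The helper resampled_frame_count is hypothetical and neither part of this library nor of speexdsp. Treat the result as an estimate and leave some headroom, since a real resampler may deliver a frame more or less per block.

```c
#include <stdint.h>

/* Hypothetical helper: estimate how many output frames a block of
 * in_frames input frames turns into, rounded up. */
uint32_t resampled_frame_count(uint32_t in_frames, uint32_t in_rate,
                               uint32_t out_rate) {
    if (in_rate == 0) {
        return 0; /* avoid division by zero on bogus input */
    }
    /* multiply in 64 bit so large blocks cannot overflow */
    uint64_t scaled = (uint64_t)in_frames * out_rate;
    return (uint32_t)((scaled + in_rate - 1) / in_rate);
}
```

One MP3 frame of 1152 samples at 44100 Hz becomes about 1254 frames at 48000 Hz.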

Filter: Parametric EQ

  • Include: audio_filter_param_eq.h
  • Create: audio_filter_param_eq(EQBand *bands, int numBands, float prescale);
  • What it does: Applies a list of filters (bands) to the sample data

You can select which implementation should run in this filter by editing the header file:

  • Define SECOND_ORDER_FILTER to create a filter with two chained biquad stages (faster rolloff, but needs twice the processing time)
  • Define FIXEDPOINT if you do not have an FPU, but be aware the filter is unstable with bands below 300 Hz in this case (lower shelf filter is ok though)!
  • Define LOW_PRECISION to use float processing instead of double. Mutually exclusive with fixed point.
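
To show what chained biquad stages mean in practice, here is a small single precision sketch of a biquad section and a loop that runs one band through one or two stages. The Biquad type and both function names are made up for this example and do not reflect the actual implementation in audio_filter_param_eq. Calculating the coefficients for a peak or shelf band is left out.

```c
#include <stddef.h>

/* Hypothetical type: one biquad section, coefficients normalized to a0 = 1 */
typedef struct {
    float b0, b1, b2, a1, a2;
    float z1, z2; /* filter state, must be zero before the first sample */
} Biquad;

/* Process a single sample (transposed direct form II) */
float biquad_process(Biquad *f, float in) {
    float out = f->b0 * in + f->z1;
    f->z1 = f->b1 * in - f->a1 * out + f->z2;
    f->z2 = f->b2 * in - f->a2 * out;
    return out;
}

/* Run a mono buffer in place through num_stages chained sections.
 * Two stages with identical coefficients give the steeper rolloff
 * at twice the processing cost. */
void band_process(Biquad *stages, size_t num_stages,
                  float *samples, size_t count) {
    for (size_t i = 0; i < count; i++) {
        float s = samples[i];
        for (size_t k = 0; k < num_stages; k++) {
            s = biquad_process(&stages[k], s);
        }
        samples[i] = s;
    }
}
```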

Example:

#include "audio.h"
#include "audio_source_testtone.h"
#include "audio_filter_param_eq.h"
#include "audio_sink_libao.h"

int main(void) {
    AudioPipeline *pipeline;
    EQBand bands[] = { /* Bath tub curve */
        {
            .type = EQTypeLowShelf,
            .frequency = 150.0,
            .Q = 1.41,
            .gain = 1.5,
        },
        {
            .type = EQTypePeak,
            .frequency = 4000.0,
            .Q = 1.41,
            .gain = 0.0
        },
        {
            .type = EQTypePeak,
            .frequency = 10000.0,
            .Q = 2.82,
            .gain = 1.0
        }
    };

    pipeline = audio_pipeline_assemble(
        audio_source_testtone(48000, 2, 16),
        audio_filter_param_eq(bands, 3, -2.5),
        audio_sink_libao(),
        NULL
    );

    ...
}

Sink: File

  • Include: audio_sink_file.h
  • Create: audio_sink_file(char *filename);
  • What it does: Writes whatever data it gets to a file, primarily used for debugging.

Sink: libao

  • Include: audio_sink_libao.h
  • Create: audio_sink_libao();
  • What it does: Opens the default audio device of your computer and writes the input samples to that device.
  • Dependencies: locally installed libao (on a Mac, Homebrew is fine; Linux distributions should have it in their package managers)

TODO

A list of filters, demuxers and decoders that are planned to be implemented in the future:

  • Demuxer: ISO MPEG-4 containers (mp4/mov)
  • Demuxer: OGG (for Vorbis and Opus)
  • Decoder: FLAC
  • Decoder: Opus
  • Decoder: Vorbis
  • Decoder: WAV (PCM)
  • Decoder: ALAC (Apple Lossless, used by AirPlay)
  • Decoder: QOA (Quite OK Audio)
  • Source: I2S
  • Source: HTTP/S
  • Source: Icecast
  • Source: Bluetooth A2DP
  • Source: RTP
  • Source: VBAN
  • Sink: I2S
  • Sink: Bluetooth A2DP
  • Sink: SD-Card
  • Sink: RTP
  • Sink: VBAN
  • Encoder: WAV (PCM)
  • Encoder: QOA (Quite OK Audio)
  • Filter: Channel Mixer
  • Filter: Volume
  • Filter: Bankstown (Pseudo Bass enhancement for small speakers)