# Audio Pipeline

This project will be a complete audio processing pipeline for microcontrollers.
The idea is to be able to run various demuxers, decoders and filters directly on a
microcontroller. Goals include:

- Decode and process audio on ESP32
- Use as little RAM as possible. Basic functionality should run on WROOM modules.
- If extra RAM is available (as in the WROVER modules) it should be possible to
  run quite sophisticated DSP pipelines.
- Only use fixed point and at most single precision `float` calculations as the
  ESP32 has no FPU for `double` precision and only one FPU for both CPU cores.
- The pipeline sources and sinks should be flexible, e.g. allow I2S data to run
  through an EQ filter and then emit the result as I2S or sink it to Bluetooth.
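
As a rough illustration of the fixed-point goal above, here is a minimal sketch of applying a gain to a 16-bit sample using only integer arithmetic. The function name `apply_gain_q15` and the Q15 gain format are assumptions made for this example; they are not part of the library's API.

```c
#include <stdint.h>

/* Hypothetical helper, not part of the library: scale one 16-bit sample by a
 * Q15 gain, where 32768 represents 1.0 (usable range is 0.0 to just under 2.0). */
int16_t apply_gain_q15(int16_t sample, uint16_t gain_q15) {
    /* |sample| <= 32768 and gain <= 65535, so the product fits in 32 bits. */
    int32_t scaled = ((int32_t)sample * (int32_t)gain_q15) / 32768;

    /* Saturate instead of wrapping around; wrap-around is audible as a loud click. */
    if (scaled > INT16_MAX) return INT16_MAX;
    if (scaled < INT16_MIN) return INT16_MIN;
    return (int16_t)scaled;
}
```

Division is used instead of a right shift so that negative samples behave the same on every compiler; a real implementation might prefer a shift for speed.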

The license will be BSD where possible. Some decoders may use GPL code, so if
you include those you will be bound by the GPL.

## How to use

To build a pipeline you need at least two elements: a source and a sink. See
*Implemented elements* for documentation.

If you have to decode an audio format you will probably need a demuxer to parse
the file format or bitstream and a decoder to decode the compressed audio to a
sample buffer.

If you have external requirements like a fixed number of channels, volume or
sample rate constraints you will probably need one or more filter elements.

To build a pipeline you will need at least `audio.h` and the header files for
the elements you want to add to the pipeline. For a code example look at
`AudioLib/main.c`.

A small taste: this plays the file `test.mp3` on the default audio device
(you will need `libao` for the sink):

```c
#include "audio.h"
#include "audio_source_file.h"
#include "audio_demuxer_mp3.h"
#include "audio_decoder_mp3.h"
#include "audio_sink_libao.h"

int main(int argc, char **argv) {
    AudioPipeline *pipeline = audio_pipeline_assemble(
        audio_source_file("test.mp3", 512),
        audio_demuxer_mp3(),
        audio_decoder_mp3(),
        audio_sink_libao(),
        NULL
    );

    if (pipeline != NULL) {
        AudioPipelineStatus result = pipeline->start(pipeline);
        audio_pipeline_destroy(pipeline);
        return result == PipelineFinished ? 0 : 1;
    }
    return 1;
}
```

## Compiling

If you're on macOS, use the Xcode project file to build all dependencies. You
will need to have `libao` installed via Homebrew.

If you're on Unix/Linux, you can use the provided `Makefile`s. These create
one static library per dependency and one for the audio library in
`audio/.build` and its subdirectories.

Currently the build produces:

- `libaudio-test`: the test binary created from `AudioLib/main.c`
- `audio/.build/libaudio.a`: static library for the audio library
- `audio/.build/libmad/libmad.a`: MP3 decoder library (dependency, GPL license)
- `audio/.build/speexdsp/libspeexresampler.a`: Speex resampler library (dependency, MIT license)


## Implemented elements

Only a few elements are implemented so far:

### Source: Test tone

- Include: `audio_source_testtone.h`
- Create: `audio_source_testtone(uint16_t sample_rate, uint8_t channels, uint8_t bits_per_sample);`
- What it does: Plays a sequence of pings from 440 Hz to 1440 Hz and then stops

### Source: File

- Include: `audio_source_file.h`
- Create: `audio_source_file(char *filename, uint32_t block_size);`
- What it does: Loads a file from disk and emits chunks of that file to the next
  element (usually a demuxer)

### Demuxer: MP3

- Include: `audio_demuxer_mp3.h`
- Create: `audio_demuxer_mp3();`
- What it does: Finds the sync marker in the input data stream and assembles
  complete MP3 frames to emit to the decoder
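
To make the sync search more concrete, here is a minimal sketch of locating an MPEG audio frame header in a byte buffer. It is not the library's demuxer code: the function name `mp3_find_sync` is hypothetical, and the checks shown are only the well-known ones from the MPEG audio header layout (11 sync bits set, no reserved version, layer, bitrate or sample rate values).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns the offset of the first plausible MPEG audio
 * frame header in buf, or -1 if none is found. */
long mp3_find_sync(const uint8_t *buf, size_t len) {
    for (size_t i = 0; i + 2 < len; i++) {
        /* The sync marker is 11 set bits: 0xFF plus the top 3 bits of the next byte. */
        if (buf[i] != 0xFF || (buf[i + 1] & 0xE0) != 0xE0) {
            continue;
        }
        uint8_t version = (buf[i + 1] >> 3) & 0x03; /* 01 is reserved */
        uint8_t layer   = (buf[i + 1] >> 1) & 0x03; /* 00 is reserved */
        uint8_t bitrate = (buf[i + 2] >> 4) & 0x0F; /* 1111 is invalid */
        uint8_t rate    = (buf[i + 2] >> 2) & 0x03; /* 11 is reserved */
        if (version == 0x01 || layer == 0x00 || bitrate == 0x0F || rate == 0x03) {
            continue; /* looks like a sync marker but is not a valid header */
        }
        return (long)i;
    }
    return -1;
}
```

A robust demuxer would typically also compute the frame length from the header and confirm that another valid header follows, since audio payload bytes can resemble a sync marker by chance.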

### Decoder: MP3

- Include: `audio_decoder_mp3.h`
- Create: `audio_decoder_mp3();`
- What it does: Takes MP3 frames from the demuxer and decodes them to 16-bit audio samples
- Dependencies: modified `libmad` in `deps/mp3`
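
For context, stock `libmad` produces samples in a 32-bit fixed-point format with 28 fractional bits (`MAD_F_FRACBITS`), which then have to be reduced to 16 bits. The sketch below follows the round, clip, quantize approach used in libmad's `minimad.c` example. Whether the modified copy in `deps/mp3` does exactly this is an assumption, and the names `FIXED_FRACBITS`, `FIXED_ONE` and `fixed28_to_s16` are made up for this example.

```c
#include <stdint.h>

#define FIXED_FRACBITS 28                          /* same value as libmad's MAD_F_FRACBITS */
#define FIXED_ONE ((int32_t)1 << FIXED_FRACBITS)   /* 1.0 in this format */

/* Hypothetical helper: convert one 28-bit fixed-point sample to signed 16-bit.
 * Assumes the input is within the range a decoder normally produces and that
 * right-shifting a negative value is an arithmetic shift (true for GCC and Clang). */
int16_t fixed28_to_s16(int32_t sample) {
    /* Round: add half of the least significant bit that survives the shift. */
    sample += (int32_t)1 << (FIXED_FRACBITS - 16);

    /* Clip to [-1.0, 1.0) so the result fits into 16 bits. */
    if (sample >= FIXED_ONE) {
        sample = FIXED_ONE - 1;
    } else if (sample < -FIXED_ONE) {
        sample = -FIXED_ONE;
    }

    /* Quantize: keep the sign bit plus the top 15 fractional bits. */
    return (int16_t)(sample >> (FIXED_FRACBITS + 1 - 16));
}
```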

### Filter: Resample

- Include: `audio_filter_resample.h`
- Create: `audio_filter_resample(uint32_t output_sample_rate);`
- What it does: Resamples sample data from previous elements to match `output_sample_rate`
- Dependencies: `speexdsp` resampler implementation in `deps/resampler`

You can configure the quality of the resampler by editing the `#define` in the header file.
The lowest setting is `0` (not recommended, it sounds bad). The default for microcontrollers
should be around `1`; on more powerful devices you may go up to `4` (which is what Opus
uses internally).

Values above `4` increase the quality in theory, but the difference is not really audible.
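
When memory is tight it helps to know how many frames a resampling step can produce for a given input block. The sketch below shows only the rate-ratio arithmetic; it is not part of the library or of `speexdsp`, and the function name `resample_max_output_frames` is hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: upper bound on the number of output frames produced
 * when converting input_frames from input_rate to output_rate (both in Hz). */
uint32_t resample_max_output_frames(uint32_t input_frames,
                                    uint32_t input_rate,
                                    uint32_t output_rate) {
    if (input_rate == 0) {
        return 0; /* avoid dividing by zero on an unconfigured stream */
    }
    /* Use a 64-bit intermediate so frames * rate cannot overflow. */
    uint64_t scaled = (uint64_t)input_frames * output_rate;

    /* Round up: a partial output frame still needs a slot in the buffer. */
    return (uint32_t)((scaled + input_rate - 1) / input_rate);
}
```

For example, one MPEG-1 Layer III frame of 1152 samples at 44100 Hz needs room for 1254 frames at 48000 Hz.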

### Filter: Parametric EQ

- Include: `audio_filter_param_eq.h`
- Create: `audio_filter_param_eq(EQBand *bands, int numBands, float prescale);`
- What it does: Applies a list of filters (bands) to the sample data

You can select which implementation should run in this filter by editing the header file:

- Define `SECOND_ORDER_FILTER` to create a filter with two chained biquad stages (faster
  rolloff, but needs twice the processing time)
- Define `FIXEDPOINT` if you do not have an FPU, but be aware that the filter is unstable
  with bands below 300 Hz in this case (a lower shelf filter is fine though)!
- Define `LOW_PRECISION` to use `float` processing instead of `double`. Mutually exclusive
  with `FIXEDPOINT`.
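
To illustrate what a biquad stage and "two chained biquad stages" mean, here is a minimal single-precision sketch in Direct Form I. It is not the library's implementation: the type `BiquadStage` and the functions `biquad_step` and `biquad_chain2` are hypothetical, and the coefficients are assumed to be precomputed and normalized so that `a0 == 1`.

```c
#include <stddef.h>

/* Hypothetical type: one biquad stage with its coefficients and history. */
typedef struct {
    float b0, b1, b2;   /* feed-forward coefficients */
    float a1, a2;       /* feedback coefficients (a0 normalized to 1) */
    float in1, in2;     /* previous two input samples */
    float out1, out2;   /* previous two output samples */
} BiquadStage;

/* Process one sample through one stage (Direct Form I difference equation). */
float biquad_step(BiquadStage *s, float x) {
    float y = s->b0 * x + s->b1 * s->in1 + s->b2 * s->in2
            - s->a1 * s->out1 - s->a2 * s->out2;
    s->in2 = s->in1;
    s->in1 = x;
    s->out2 = s->out1;
    s->out1 = y;
    return y;
}

/* Run a buffer through two stages in series. The second stage filters the
 * output of the first, which steepens the rolloff but doubles the work. */
void biquad_chain2(BiquadStage *first, BiquadStage *second,
                   float *samples, size_t count) {
    for (size_t i = 0; i < count; i++) {
        samples[i] = biquad_step(second, biquad_step(first, samples[i]));
    }
}
```

Each stage keeps its own history, so the two stages of a chain must not share one `BiquadStage` instance.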

Example:

```c
#include "audio.h"
#include "audio_source_testtone.h"
#include "audio_filter_param_eq.h"
#include "audio_sink_libao.h"

int main(void) {
    AudioPipeline *pipeline;
    EQBand bands[] = { /* Bath tub curve */
        {
            .type = EQTypeLowShelf,
            .frequency = 150.0,
            .Q = 1.41,
            .gain = 1.5,
        },
        {
            .type = EQTypePeak,
            .frequency = 4000.0,
            .Q = 1.41,
            .gain = 0.0
        },
        {
            .type = EQTypePeak,
            .frequency = 10000.0,
            .Q = 2.82,
            .gain = 1.0
        }
    };

    pipeline = audio_pipeline_assemble(
        audio_source_testtone(48000, 2, 16),
        audio_filter_param_eq(bands, 3, -2.5),
        audio_sink_libao(),
        NULL
    );

    ...
}
```

### Sink: File

- Include: `audio_sink_file.h`
- Create: `audio_sink_file(char *filename);`
- What it does: Writes whatever data it receives to a file; primarily used for debugging.

### Sink: libao

- Include: `audio_sink_libao.h`
- Create: `audio_sink_libao();`
- What it does: Opens the default audio device of your computer and writes the input
  samples to that device.
- Dependencies: locally installed `libao` (on a Mac, Homebrew is fine; Linux distributions
  should have it in their package managers)

## TODO

A list of filters, demuxers and decoders that are planned to be implemented in the future:

- Demuxer: ISO MPEG 4 Containers (mp4/mov)
- Demuxer: Ogg (for Vorbis and Opus)
- Decoder: AAC
- Decoder: FLAC
- Decoder: Opus
- Decoder: Vorbis
- Decoder: WAV (PCM)
- Decoder: ALAC (Apple Lossless, used by AirPlay)
- Decoder: QOA (Quite Okay Audio)
- Source: I2S
- Source: HTTP/S
- Source: Icecast
- Source: Bluetooth A2DP
- Source: RTP
- Source: VBAN
- Sink: I2S
- Sink: Bluetooth A2DP
- Sink: SD-Card
- Sink: RTP
- Sink: VBAN
- Encoder: WAV (PCM)
- Encoder: QOA (Quite Okay Audio)
- Filter: Channel Mixer
- Filter: Volume
- Filter: Bankstown (Pseudo Bass enhancement for small speakers)