# Audio Pipeline

This project will be a complete audio processing pipeline for microcontrollers.
The idea is to be able to run various demuxers, decoders and filters directly on a
microcontroller. Goals include:

- Decode and process audio on ESP32
- Use as little RAM as possible. Basic functionality should run on WROOM modules.
- If extra RAM is available (as in the WROVER modules) it should be possible to
  run quite sophisticated DSP pipelines.
- Only use fixed point and at most single precision `float` calculations as the
  ESP32 has no FPU for `double` precision and only one FPU for both CPU cores.
- The pipeline sources and sinks should be flexible, e.g. allow I2S data to run
  through an EQ filter and then emit the result as I2S or sink it to Bluetooth.

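To illustrate the fixed point goal: in Q15 arithmetic a 16 bit integer represents a value in [-1.0, 1.0), so a gain can be applied with integer operations only. The following is a minimal sketch, not code from this library; the names `q15_mul` and `apply_gain_q15` are made up for the illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Q15 format: real value = raw / 32768, range [-1.0, 1.0). */
typedef int16_t q15_t;

/* Clamp a 32 bit intermediate result to the int16_t range. */
static inline int16_t saturate16(int32_t v) {
    if (v > INT16_MAX) return INT16_MAX;
    if (v < INT16_MIN) return INT16_MIN;
    return (int16_t)v;
}

/* Multiply a 16 bit sample by a Q15 gain, rounding to nearest.
 * Assumes >> on negative values is an arithmetic shift, which holds
 * for GCC and Clang but is implementation-defined in ISO C. */
static inline int16_t q15_mul(int16_t sample, q15_t gain) {
    int32_t product = (int32_t)sample * gain;
    return saturate16((product + (1 << 14)) >> 15);
}

/* Hypothetical helper: scale a buffer of samples in place. */
void apply_gain_q15(int16_t *samples, size_t count, q15_t gain) {
    for (size_t i = 0; i < count; i++) {
        samples[i] = q15_mul(samples[i], gain);
    }
}
```

A gain of `16384` corresponds to 0.5 and `8192` to 0.25. The 32 bit intermediate keeps the full product before rounding, and the saturation step only matters for the corner case of `-32768 * -32768`.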
The license will be BSD where possible. Some decoders may use GPL code, so if
you include those you will be bound by the GPL.

## How to use

To build a pipeline you need at least two elements: a source and a sink. See
*Implemented elements* for documentation.

If you have to decode an audio format you will probably need a demuxer to parse
the file format or bitstream and a decoder to decode the compressed audio to a
sample buffer.

If you have external requirements like a fixed number of channels, volume or
sample rate constraints you will probably need one or more filter elements.

To build a pipeline you will need at least `audio.h` and the header files for
the elements you want to add to the pipeline. For a code example look at
`AudioLib/main.c`.

A small taste (this plays the file `test.mp3` on the default audio device;
you will need `libao` for the sink):

```c
#include "audio.h"
#include "audio_source_file.h"
#include "audio_demuxer_mp3.h"
#include "audio_decoder_mp3.h"
#include "audio_sink_libao.h"

int main(int argc, char **argv) {
    /* Elements are listed in processing order; NULL ends the list. */
    AudioPipeline *pipeline = audio_pipeline_assemble(
        audio_source_file("test.mp3", 512), /* block_size = 512 */
        audio_demuxer_mp3(),
        audio_decoder_mp3(),
        audio_sink_libao(),
        NULL
    );

    if (pipeline != NULL) {
        /* Run the pipeline, then release it */
        AudioPipelineStatus result = pipeline->start(pipeline);
        audio_pipeline_destroy(pipeline);
        return result == PipelineFinished ? 0 : 1;
    }
    return 1;
}
```

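The `NULL` at the end of the `audio_pipeline_assemble` call matters because a variadic C function has no other way to know where its argument list ends. The sketch below shows the general technique with a stand-in `Element` type; it is not the library's implementation, and `chain_elements` and `chain_length` are hypothetical names.

```c
#include <stdarg.h>
#include <stddef.h>

/* Hypothetical stand-in for a pipeline element, not the library's type. */
typedef struct Element {
    const char *name;
    struct Element *next;
} Element;

/* Link all arguments into a chain. The argument list must end with a
 * NULL pointer; leaving it out is undefined behaviour because va_arg
 * would read past the supplied arguments. */
Element *chain_elements(Element *first, ...) {
    va_list args;
    Element *tail = first;

    if (first == NULL) return NULL;

    va_start(args, first);
    for (;;) {
        Element *next = va_arg(args, Element *);
        if (next == NULL) break;
        tail->next = next;
        tail = next;
    }
    va_end(args);

    tail->next = NULL;
    return first;
}

/* Count the elements in a chain. */
size_t chain_length(const Element *head) {
    size_t n = 0;
    for (; head != NULL; head = head->next) n++;
    return n;
}
```

Calling `chain_elements(&source, &filter, &sink, (Element *)NULL)` links the three elements in order. The cast on the terminator is the strictly portable form, since a bare `NULL` may be a plain integer `0` on some platforms.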
## Compiling

If you're on macOS, just use the Xcode project file to build all dependencies.
You will need to have `libao` installed via Homebrew.

If you're on Unix/Linux you can use the provided `Makefile`s. This will create
one static library per dependency and one for the audio library in the
subdirectory `audio/.build` and its subdirectories.

So currently you will get:

- `libaudio-test`: the test binary created from `AudioLib/main.c`
- `audio/.build/libaudio.a`: static library for the audio library
- `audio/.build/libmad/libmad.a`: MP3 decoder library (dependency, GPL license)
- `audio/.build/speexdsp/libspeexresampler.a`: Speex resampler library (dependency, MIT license)

## Implemented elements

Only a few elements are implemented so far:

### Source: Test tone

- Include: `audio_source_testtone.h`
- Create: `audio_source_testtone(uint16_t sample_rate, uint8_t channels, uint8_t bits_per_sample);`
- What it does: Play a sequence of pings from 440 Hz to 1440 Hz and then stop

### Source: File

- Include: `audio_source_file.h`
- Create: `audio_source_file(char *filename, uint32_t block_size);`
- What it does: Loads a file from disk and emits chunks of that file to the next
  element (usually a demuxer)

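The chunking idea behind the file source can be sketched in plain C as follows. This is an illustration under assumptions, not the element's actual code: `emit_blocks`, `block_handler` and `count_blocks` are hypothetical names, and the real element hands its data to the next pipeline element instead of a callback.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical callback: receives one block and its length in bytes. */
typedef void (*block_handler)(const uint8_t *data, size_t len, void *user);

/* Read `fp` in blocks of `block_size` bytes and pass each block to
 * `handler`. The last block may be shorter. Returns the total number
 * of bytes read, or -1 on error. */
long emit_blocks(FILE *fp, size_t block_size, block_handler handler, void *user) {
    uint8_t *buf;
    size_t n;
    long total = 0;

    if (fp == NULL || block_size == 0 || handler == NULL) return -1;

    buf = malloc(block_size);
    if (buf == NULL) return -1;

    while ((n = fread(buf, 1, block_size, fp)) > 0) {
        handler(buf, n, user);
        total += (long)n;
    }
    if (ferror(fp)) total = -1;

    free(buf);
    return total;
}

/* Example handler: counts how many blocks were delivered. */
void count_blocks(const uint8_t *data, size_t len, void *user) {
    (void)data;
    (void)len;
    ++*(size_t *)user;
}
```

Only one buffer of `block_size` bytes is allocated regardless of the file size, which is the property that matters on a RAM constrained target.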
### Demuxer: MP3

- Include: `audio_demuxer_mp3.h`
- Create: `audio_demuxer_mp3();`
- What it does: Finds the sync marker in the input data stream and assembles complete MP3 frames to be emitted to the decoder

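An MPEG audio frame header starts with eleven set bits, so the sync search comes down to scanning for `0xFF` followed by a byte whose top three bits are set. The sketch below adds a few sanity checks on reserved header values to reduce false positives. It illustrates the technique only: `find_mp3_sync` and `is_plausible_header` are hypothetical names, and the demuxer in this project may work differently.

```c
#include <stddef.h>
#include <stdint.h>

/* Check the first three header bytes for values that cannot occur in a
 * valid MPEG audio frame header. */
static int is_plausible_header(const uint8_t *h) {
    if (h[0] != 0xFF || (h[1] & 0xE0) != 0xE0) return 0; /* 11 sync bits */
    if (((h[1] >> 3) & 0x03) == 0x01) return 0; /* reserved MPEG version */
    if (((h[1] >> 1) & 0x03) == 0x00) return 0; /* reserved layer */
    if ((h[2] >> 4) == 0x0F) return 0;          /* invalid bitrate index */
    if (((h[2] >> 2) & 0x03) == 0x03) return 0; /* reserved sample rate */
    return 1;
}

/* Return the offset of the first plausible frame header in `data`,
 * or -1 if there is none within `len` bytes. */
long find_mp3_sync(const uint8_t *data, size_t len) {
    for (size_t i = 0; i + 2 < len; i++) {
        if (is_plausible_header(data + i)) {
            return (long)i;
        }
    }
    return -1;
}
```

A robust demuxer would additionally compute the frame length from the header and confirm that another sync marker follows at that distance, because the sync pattern can also appear inside compressed audio data.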
### Decoder: MP3

- Include: `audio_decoder_mp3.h`
- Create: `audio_decoder_mp3();`
- What it does: Takes MP3 frames from the demuxer and decodes them to 16 bit audio samples
- Dependencies: modified `libmad` in `deps/mp3`

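Upstream libmad produces samples as signed 32 bit fixed point values with 28 fractional bits (`MAD_F_FRACBITS`), which then have to be rounded, clipped and shifted down to 16 bit. The sketch below shows that conversion, assuming the modified libmad in this project keeps the upstream sample format. The constants are redefined locally so the block builds without libmad, and `fixed_to_pcm16` is a hypothetical name.

```c
#include <stdint.h>

/* Assumption: same layout as upstream libmad's mad_fixed_t. */
#define FRACBITS   28
#define FIXED_ONE  ((int32_t)1 << FRACBITS)          /* represents 1.0 */
#define ROUND_HALF ((int32_t)1 << (FRACBITS - 16))   /* half an output LSB */

/* Convert one fixed point sample to a 16 bit PCM sample. */
int16_t fixed_to_pcm16(int32_t sample) {
    /* Clip values at or above 1.0 (after rounding) and below -1.0.
     * Checking before adding the rounding term avoids 32 bit overflow. */
    if (sample >= FIXED_ONE - ROUND_HALF) return INT16_MAX;
    if (sample < -FIXED_ONE) return INT16_MIN;

    /* Round, then keep the sign bit plus 15 fractional bits.
     * Assumes >> on negative values is an arithmetic shift. */
    return (int16_t)((sample + ROUND_HALF) >> (FRACBITS + 1 - 16));
}
```

The clipping step is needed because decoded MP3 samples can slightly exceed full scale even when the original signal did not.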
### Filter: Resample

- Include: `audio_filter_resample.h`
- Create: `audio_filter_resample(uint32_t output_sample_rate);`
- What it does: Resamples sample data from previous elements to match `output_sample_rate`
- Dependencies: `speexdsp` resampler implementation in `deps/resampler`

You can configure the quality of the resampler by editing the `#define` in the header file.
The lowest setting is `0` (not recommended, sounds bad!), the default for microcontrollers
should be around `1`, and for more powerful devices you may go up to `4` (which is what Opus
uses internally).

All values above `4` increase the quality in theory but the changes are not really audible.

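To make the idea of sample rate conversion concrete, here is a deliberately simple linear interpolation resampler for mono 16 bit data. This is not what the `speexdsp` resampler does (it uses a more elaborate band-limited filter design, which is why it has quality settings), and `resample_linear` is a hypothetical name. The sketch only shows how output positions map back onto input samples using integer arithmetic.

```c
#include <stddef.h>
#include <stdint.h>

/* Resample mono 16 bit samples by linear interpolation.
 * Writes at most `out_cap` samples to `out` and returns the number
 * written. No anti-aliasing filter is applied when downsampling. */
size_t resample_linear(const int16_t *in, size_t in_len, uint32_t in_rate,
                       int16_t *out, size_t out_cap, uint32_t out_rate) {
    size_t n = 0;
    uint64_t pos = 0; /* input position, in units of 1/out_rate samples */

    if (in_len == 0 || in_rate == 0 || out_rate == 0) return 0;

    while (n < out_cap) {
        size_t idx = (size_t)(pos / out_rate);      /* whole input sample */
        uint32_t frac = (uint32_t)(pos % out_rate); /* fractional part */
        int32_t a, b;

        if (idx >= in_len) break;

        a = in[idx];
        b = (idx + 1 < in_len) ? in[idx + 1] : a;   /* hold the last sample */

        /* a + (b - a) * frac / out_rate, computed in 64 bit */
        out[n++] = (int16_t)(a + (int64_t)(b - a) * frac / out_rate);
        pos += in_rate;
    }
    return n;
}
```

Upsampling `{0, 100, 200, 300}` from 24000 Hz to 48000 Hz with this function yields `{0, 50, 100, 150, 200, 250, 300, 300}`. Linear interpolation is cheap but audibly dulls and aliases the signal, which is the trade-off the quality setting of a proper resampler addresses.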
### Filter: Parametric EQ

- Include: `audio_filter_param_eq.h`
- Create: `audio_filter_param_eq(EQBand *bands, int numBands, float prescale);`
- What it does: Applies a list of filters (bands) to the sample data

You can select which implementation should run in this filter by editing the header file:

- Define `SECOND_ORDER_FILTER` to create a filter with two chained biquad stages (faster
  rolloff, but needs twice the processing time)
- Define `FIXEDPOINT` if you do not have an FPU, but be aware the filter is unstable with
  bands below 300 Hz in this case (lower shelf filter is ok though)!
- Define `LOW_PRECISION` to use `float` processing instead of `double`. Mutually exclusive
  with `FIXEDPOINT`.

Example:

```c
#include "audio.h"
#include "audio_source_testtone.h"
#include "audio_filter_param_eq.h"
#include "audio_sink_libao.h"

int main(void) {
    AudioPipeline *pipeline;
    EQBand bands[] = { /* Bath tub curve */
        {
            .type = EQTypeLowShelf,
            .frequency = 150.0,
            .Q = 1.41,
            .gain = 1.5,
        },
        {
            .type = EQTypePeak,
            .frequency = 4000.0,
            .Q = 1.41,
            .gain = 0.0
        },
        {
            .type = EQTypePeak,
            .frequency = 10000.0,
            .Q = 2.82,
            .gain = 1.0
        }
    };

    /* Test tone source -> parametric EQ with three bands -> audio device */
    pipeline = audio_pipeline_assemble(
        audio_source_testtone(48000, 2, 16),
        audio_filter_param_eq(bands, 3, -2.5),
        audio_sink_libao(),
        NULL
    );

    ...
}
```

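The `SECOND_ORDER_FILTER` option chains two biquad stages. As a rough illustration of what chaining means, the sketch below runs samples through any number of biquad stages in series, using the Direct Form I difference equation with single precision `float`. The `Biquad` struct and function names are hypothetical and not taken from `audio_filter_param_eq.h`; computing the coefficients for a given band type is left out.

```c
#include <stddef.h>

/* Normalized biquad coefficients (a0 == 1) plus Direct Form I state. */
typedef struct {
    float b0, b1, b2, a1, a2; /* coefficients */
    float x1, x2, y1, y2;     /* previous two inputs and outputs */
} Biquad;

/* Process one sample:
 * y[n] = b0*x[n] + b1*x[n-1] + b2*x[n-2] - a1*y[n-1] - a2*y[n-2] */
static float biquad_step(Biquad *f, float x) {
    float y = f->b0 * x + f->b1 * f->x1 + f->b2 * f->x2
            - f->a1 * f->y1 - f->a2 * f->y2;
    f->x2 = f->x1;
    f->x1 = x;
    f->y2 = f->y1;
    f->y1 = y;
    return y;
}

/* Run `n` samples through `count` stages in series, in place.
 * Each stage keeps its own state, so the output of one stage is the
 * input of the next. */
void biquad_cascade(Biquad *stages, size_t count, float *samples, size_t n) {
    for (size_t i = 0; i < n; i++) {
        float v = samples[i];
        for (size_t s = 0; s < count; s++) {
            v = biquad_step(&stages[s], v);
        }
        samples[i] = v;
    }
}
```

Two identical stages double the per-sample work and double the filter's gain in dB at every frequency, which matches the trade-off described for `SECOND_ORDER_FILTER`.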
### Sink: File

- Include: `audio_sink_file.h`
- Create: `audio_sink_file(char *filename);`
- What it does: Writes whatever data it gets to a file, primarily used for debugging.

### Sink: libao

- Include: `audio_sink_libao.h`
- Create: `audio_sink_libao();`
- What it does: Opens the default audio device of your computer and dumps the input
  samples to that device.
- Dependencies: locally installed `libao` (on a Mac Homebrew is fine, Linux distributions
  should have it in their package managers)

## TODO

A list of filters, demuxers and decoders that are planned to be implemented in the future:

- Demuxer: ISO MPEG 4 Containers (mp4/mov)
- Demuxer: OGG (for Vorbis and Opus)
- Decoder: AAC
- Decoder: FLAC
- Decoder: Opus
- Decoder: Vorbis
- Decoder: WAV (PCM)
- Decoder: ALAC (Apple lossless, used by AirPlay)
- Decoder: QOA (Quite Okay Audio)
- Source: I2S
- Source: HTTP/S
- Source: Icecast
- Source: Bluetooth A2DP
- Source: RTP
- Source: VBAN
- Sink: I2S
- Sink: Bluetooth A2DP
- Sink: SD card
- Sink: RTP
- Sink: VBAN
- Encoder: WAV (PCM)
- Encoder: QOA (Quite Okay Audio)
- Filter: Channel Mixer
- Filter: Volume
- Filter: Bankstown (Pseudo Bass enhancement for small speakers)