I’m new to Defold and I’d like to contribute to the engine, especially to its audio implementation on Android and iOS. But I have a hard time figuring out how audio is implemented in Defold.
First of all: it seems that Defold contains multiple OpenSL ES implementations (in device_opensl.cpp and in opensl.c), and I don’t find a CoreAudio device for iOS except for the OpenAL ALC backend (coreaudio.c).
In sound.cpp I further find that the SoundSystem runs on its own thread and calls UpdateInternal to update the mixer, and that it can also be updated from the main thread via the Update function.
Coming from an audio programming background I am somewhat surprised, since the audio systems on mobile run on a high-priority thread and pull in their data via a callback. I would therefore expect the SoundSystem to be updated from this callback. Something like:
static void sound_mix(float* buffer, int samples)
{
    // add every playing sound into the mix buffer
    for (int i = 0; i < SOUNDS; i++)
    {
        if (sounds[i].state == STATE_PLAYING)
        {
            mix(buffer, &(sounds[i].source.buffer), samples);
        }
    }
}

static void audio_callback(float* input, float* output, int samples)
{
    static float buffer[MAX_FRAME_SIZE]; // scratch mix buffer
    // from microphone
    input_filter(input, buffer, samples);
    // mix with active sounds
    sound_mix(buffer, samples);
    // post processing
    output_filter(buffer, samples);
    // to speakers
    buffer_to_output(buffer, output, samples);
}
Could someone elaborate on this a bit? Thanks in advance!
We run the sound update on a thread on those platforms that support it; HTML5, for instance, is not threaded.
Both implementations use dmSound::UpdateInternal.
In Defold, we mix audio ourselves, and only use the native (e.g. OpenAL/OpenSL) api for queueing the sound buffers.
Each sound update we ask the native sound system if there are any free buffers.
If there are, we will mix some new audio, and put it in the free buffers.
E.g. for OpenAL there are 6 buffers, with 768 frames each.
This of course means that we play audio a bit more “in advance” than is perhaps always desired.
Especially if we want to add support for runtime effects.
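To make the polling scheme above concrete, here is a rough illustration (not Defold's actual code) of what a per-update poll against the OpenAL queueing API can look like. MixAudio and MIX_BUFFER_FRAMES are placeholders, and the sample rate/format are assumptions for the sketch:

// Illustration only: poll OpenAL for processed buffers and refill them.
#include <AL/al.h>

static const int MIX_BUFFER_FRAMES = 768; // matches the 6 x 768 frame setup mentioned above

static void UpdateOpenALQueue(ALuint source)
{
    ALint processed = 0;
    alGetSourcei(source, AL_BUFFERS_PROCESSED, &processed);

    // 6 buffers of 768 frames at 44100 Hz is roughly 100 ms of queued audio,
    // which is why playback runs a bit "in advance".
    while (processed-- > 0)
    {
        ALuint buffer;
        alSourceUnqueueBuffers(source, 1, &buffer);

        short pcm[MIX_BUFFER_FRAMES * 2]; // interleaved stereo
        MixAudio(pcm, MIX_BUFFER_FRAMES); // mix the active sounds ourselves

        alBufferData(buffer, AL_FORMAT_STEREO16, pcm, sizeof(pcm), 44100);
        alSourceQueueBuffers(source, 1, &buffer);
    }

    // restart the source if it ran dry (starvation)
    ALint state = 0;
    alGetSourcei(source, AL_SOURCE_STATE, &state);
    if (state != AL_PLAYING)
        alSourcePlay(source);
}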
Regarding contributing to Defold’s audio system:
I’ve been playing with the idea of adding a callback as you suggest to our Defold SDK (similar to your example).
Having such a callback, it would be possible to auto-generate sounds and effects from a Native Extension.
However, since I haven’t programmed a lot of sound for games, I’m not entirely sure of the requirements needed from such an API.
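Just to sketch the idea (all names below are hypothetical; nothing like this exists in the SDK today), the extension-facing surface could be as small as a single registration function, with the engine invoking the callback whenever it mixes a new buffer:

// Hypothetical sketch only; dmSound does not expose this today.
#include <math.h>

typedef void (*FMixCallback)(float* buffer, int frames, int channels, int sample_rate, void* user_ctx);

namespace dmSound
{
    // Hypothetical: invoked from the sound update, after the engine's own mixing.
    void SetMixCallback(FMixCallback callback, void* user_ctx);
}

// Example extension callback: add a simple generated 440 Hz tone into the mix.
static void MyMixCallback(float* buffer, int frames, int channels, int sample_rate, void* user_ctx)
{
    static float phase = 0.0f;
    const float step = 2.0f * 3.14159265f * 440.0f / (float)sample_rate;
    for (int i = 0; i < frames; ++i)
    {
        float s = 0.1f * sinf(phase);
        phase += step;
        for (int c = 0; c < channels; ++c)
            buffer[i * channels + c] += s;
    }
}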
So on iOS and Android Defold uses the OpenAL backend, which is coupled through device_openal.cpp. And what’s the purpose of the device in device_opensl.cpp?
Concerning Defold's audio system:
I am building a mobile game right now that uses the microphone to detect beeps and communicates with other devices through sound cues (you play in a group). I was building my own mobile game framework, but decided not to continue with that. But I did implement a flexible audio system. From experience I know that the audio system should support 4 modes:
The difference between AUDIO_MODE_RECORD_PLAYBACK and AUDIO_MODE_STREAM is that the latter copies the input to the output, while the former does not. With AUDIO_MODE_STREAM you could, for example, distort voices in real time. For this mode it is important to select the “fast audio path” (no input filtering). My audio implementation takes that into account with INPUT_MODE_FAST versus INPUT_MODE_DEFAULT. With that we should be pretty complete.
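For reference, a minimal sketch of how the modes and input settings named above might be expressed. The enum values are only the ones mentioned in this post; the full set of four modes isn't spelled out here, and the struct is just an illustration:

// Sketch based only on the constants named above; not a complete API.
enum AudioMode
{
    AUDIO_MODE_RECORD_PLAYBACK, // record from the microphone, but input is not copied to the output
    AUDIO_MODE_STREAM,          // input is copied to the output, e.g. to distort voices in real time
    // ...remaining modes omitted
};

enum InputMode
{
    INPUT_MODE_DEFAULT, // platform default input processing
    INPUT_MODE_FAST,    // request the "fast audio path": no input filtering, lower latency
};

struct AudioConfig
{
    AudioMode mode;
    InputMode input_mode;
    int       sample_rate;
    int       frame_size;
};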
What I like about the Defold approach is that you keep it simple. Along these lines, we could just supply a callback for the input and one for sound creation/playing. Then it is up to the Native Extension to fill these buffers. That way we don’t have to think about the order of processing, because that’s up to the implementer of the extension. But to be honest: I have no experience with defining a good API.
Oh and yeah, we could also rework the sampling rate to go with the device’s preference (especially important on Android).
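As a side note on the device-preferred rate: on Android one way to pick it up natively is via AAudio (Android 8.0+); this is just one option, assumed here for illustration, and not how Defold currently does it. Open a stream without forcing a rate and read back what the device chose:

// Sketch: query the device-preferred sample rate / burst size via AAudio.
#include <aaudio/AAudio.h>

static bool QueryPreferredAudioConfig(int* out_sample_rate, int* out_frames_per_burst)
{
    AAudioStreamBuilder* builder = 0;
    if (AAudio_createStreamBuilder(&builder) != AAUDIO_OK)
        return false;

    // Don't force a sample rate; let the device pick its native one.
    AAudioStreamBuilder_setPerformanceMode(builder, AAUDIO_PERFORMANCE_MODE_LOW_LATENCY);

    AAudioStream* stream = 0;
    bool ok = AAudioStreamBuilder_openStream(builder, &stream) == AAUDIO_OK;
    if (ok)
    {
        *out_sample_rate      = AAudioStream_getSampleRate(stream);
        *out_frames_per_burst = AAudioStream_getFramesPerBurst(stream);
        AAudioStream_close(stream);
    }
    AAudioStreamBuilder_delete(builder);
    return ok;
}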
What I like about the Defold approach is that you keep it simple
Thanks, that’s what we strive for. It makes maintaining easier, as well as adding new features. And also, if there’s an issue, it should (mostly) be the same code on all platforms, which is a great benefit.
Yes, a callback for the extension to be able to fill in audio is the main idea.
My concern is perhaps when it should be called.
As I mentioned, we fill up the 6 buffers as soon as they’re available.
But in order to do realtime effects, we want to reduce that latency to a minimum. For example, can we keep using our polling technique, without risking sound starvation? E.g. will it suffice to have just one or two buffers?
No worries, we’ll figure it out together!
The main thing is to look at the actual use cases, and try not to make everything too generic.
I guess figuring out the sound use cases is where I lack experience. What types of sound effects does one wish to do in a typical game?
Ah, I have to get used to the build process. I will look into that!
Concerning the audio callback:
Isn’t the “when” determined by the native audio thread? When the native audio system is ready, it will call the native callback to ask for the next frame buffer (although Android and iOS have different ways of doing this). It is then up to us to fill that frame buffer a.s.a.p. This means that our processing may take up to FRAME_SIZE / SAMPLE_RATE seconds before the next call. For example: on iOS we may set FRAME_SIZE = 512 (samples) and SAMPLE_RATE = 44100, so that we have 11.6 ms to fill/process the buffer. This amount of time is comparable with the time we have for updating and rendering a game frame.
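To put the numbers mentioned in this thread side by side (rough arithmetic only, counting frames at 44100 Hz):

// Rough latency budget comparison:
//   callback-driven:  512 frames / 44100 Hz             ~= 11.6 ms per callback
//   game frame:       1 / 60 fps                         = 16.7 ms
//   current polling:  6 buffers * 768 frames / 44100 Hz ~= 104 ms queued ahead
static const float kCallbackBudgetMs = 512.0f * 1000.0f / 44100.0f;        // ~11.6
static const float kQueuedAheadMs    = 6.0f * 768.0f * 1000.0f / 44100.0f; // ~104.5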
To conclude: as far as I know there is no need for polling, because the native audio thread takes care of that via its callback.
That depends on the implementation. Currently we update on our own sound thread, polling the state of the sound api.
Another approach would be to react on callbacks from the sound api (as you have done previously).
Your calculation suggests that for one sound frame, we’d need to keep at least two buffers in flight at any time to cover 16.67ms (given 60fps).
Perhaps changing to callbacks from the sound api should be one of your first tests?
Also, remember that we support more platforms than iOS/Android: macOS, Windows, Linux, HTML5, Switch. So we need to keep the apis consistent across them all.
I guess that this is true for the polling strategy. But with a callback approach we don’t need to worry about it. Then the native audio system pulls in a frame when it needs it. This process runs in parallel with the main thread.
Yep, I have to put my words into action
You are absolutely right! Let me sleep on it…
By the way: what is the best way to set up a development environment for this?
Happy New Year! It has been a while since I started this thread, and in the meantime time seems to fly.
Just a quick check to know whether it is possible to build the engine on my MacBook Pro: during the build I get a final exception that “32 bit hosts are not supported!”. Does this mean that the engine cannot be built on my machine? Or am I missing some setting? My guess is that my machine is too old…
To make this build run I had to make some quirky adjustments:
My MacBook (late 2011 model, running macOS High Sierra) cannot be updated to Catalina, which means Homebrew is no longer supported. As a result some packages needed to be built from source, and some did not build at all (ccache).
I did some patching to install Xcode 10.3, but it seems that Defold requires a later version. In order to build the project I renamed the installed SDKs to the expected version 12.1. This could have introduced some issues.
I too am interested in how one could extend the Audio API so that we may play or create generative sounds from code, though my requirements are much simpler than @hakoptak’s.
The API docs on the website do not list the dmSound API in the SDK, so I was uncertain where to start looking.
What I am hoping to achieve is something as simple as the Web Audio API’s createBuffer:
Create a new buffer
Fill its channel data
Play the sound
Once this is in order, adding a callback or a buffer type that can be filled/streamed would be the next goal, in order to stream audio continuously.
Though, I would be happy with just rendering buffers and being able to play them from code at runtime.
Strongly considering contributing for Hacktoberfest, but I foresee this one feature taking most of the time this month. Haha.
Can we just expose some of these methods for creating and controlling sound playback from Native Extensions?
That, coupled with a SoundDataType of SOUND_DATA_TYPE_RAW or something similar, where you would not need anything other than the raw float data for the audio.
Perhaps it might be a good first step.
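To make the proposal concrete, extension-side usage could look roughly like this. Everything below is hypothetical: SOUND_DATA_TYPE_RAW is the proposal above, and the function names are invented for the sketch, not an existing API:

// Hypothetical sketch of the proposed extension-facing API; none of these
// functions exist in the public SDK today.
#include <math.h>

namespace dmSound
{
    typedef void* HSoundData;
    typedef void* HSoundInstance;

    // Proposed: create sound data directly from raw interleaved float samples.
    HSoundData     NewSoundDataRaw(const float* samples, int frames, int channels, int sample_rate);
    HSoundInstance NewSoundInstance(HSoundData data);
    void           Play(HSoundInstance instance);
}

// Example: generate one second of a 440 Hz tone and play it.
static void PlayGeneratedTone()
{
    const int sample_rate = 44100;
    const int frames = sample_rate; // 1 second, mono
    static float samples[44100];
    for (int i = 0; i < frames; ++i)
        samples[i] = 0.2f * sinf(2.0f * 3.14159265f * 440.0f * (float)i / (float)sample_rate);

    dmSound::HSoundData data = dmSound::NewSoundDataRaw(samples, frames, 1, sample_rate);
    dmSound::HSoundInstance inst = dmSound::NewSoundInstance(data);
    dmSound::Play(inst);
}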
You can try using it first by creating a copy of that header file and using it in your extension as a proof of concept.
That way we’ll know it works and we can put the code in the Defold SDK.
(We generally don’t want to put things in the SDK if it’s not going to be used)