Android: sound API (deterministic, low latency)

I'm reviewing the various Android sound APIs and I'd like to know which one I should use. My goal is to get low-latency audio or, at least, deterministic behavior regarding playback delay.

We've had a lot of problems, and it seems that the Android sound APIs are unreliable, so I'm exploring the possibilities.

The problem we have is that there is a significant delay between sound_out.write(sound_samples); and the actual sound coming out of the speakers. Usually it is around 300 ms. The problem is that it differs across devices: some don't have the problem, but most are crippled by it (a CS, i.e. circuit-switched, voice call, however, has essentially zero latency). The biggest issue with this ridiculous delay is that on some devices it appears to be a random value (i.e. it's not always 300 ms).

I'm reading about OpenSL ES and I'd like to know if anybody has experience with it, or is it the same mess wrapped in a different package?

I'd prefer native access, but I don't mind a Java-layer indirection as long as I can get deterministic behavior: either the delay has to be constant (for a given device), or I'd like access to the current playback position instead of guessing it with an error range of ±300 ms...
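For what it's worth, if a device reports the playback head position (on Android, AudioTrack's getPlaybackHeadPosition() returns frames played since start), you can estimate the queued latency yourself rather than guessing. A minimal sketch of the arithmetic, with hypothetical frame counts standing in for the real API calls:

```java
public class LatencyEstimate {
    // Estimate output latency from frames written vs. frames actually played.
    // On Android, framesWritten would be what you have pushed via AudioTrack.write(),
    // and framesPlayed would come from AudioTrack.getPlaybackHeadPosition().
    static double latencyMs(long framesWritten, long framesPlayed, int sampleRateHz) {
        long buffered = framesWritten - framesPlayed; // frames still queued in the pipeline
        return 1000.0 * buffered / sampleRateHz;
    }

    public static void main(String[] args) {
        // Example: 48 kHz stream, 14400 frames written, none played yet -> 300 ms queued
        System.out.println(latencyMs(14400, 0, 48000)); // 300.0
    }
}
```

This only bounds the latency inside the app's own pipeline; delay added below the HAL (DSP, resamplers, the speaker path itself) is invisible to it, which is exactly why measured figures vary per device.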

EDIT: 1.5 years later, I tried multiple Android phones to see how to get the best possible latency for real-time voice communication. Using specialized tools I measured the delay of the waveout path. The best results were just over 100 ms; most phones were in the 180 ms range. Does anybody have ideas?

Answers


SoundPool is the lowest-latency interface on most devices, because the pool is stored in the audio process. All of the other audio paths require inter-process communication. OpenSL is the best choice if SoundPool doesn't meet your needs.

Why OpenSL? AudioTrack and OpenSL have similar latencies, with one important difference: AudioTrack buffer callbacks are serviced in Dalvik, while OpenSL callbacks are serviced in native threads. The current implementation of Dalvik isn't capable of servicing callbacks at extremely low latencies, because there is no way to suspend garbage collection during audio callbacks. This means that the minimum size for AudioTrack buffers has to be larger than the minimum size for OpenSL buffers to sustain glitch-free playback.

On most Android releases this difference between AudioTrack and OpenSL made little practical difference. But with Jellybean, Android now has a low-latency audio path. The actual latency is still device-dependent, but it can be considerably lower than before. For instance, http://code.google.com/p/music-synthesizer-for-android/ uses 384-frame buffers on the Galaxy Nexus for a total output latency of under 30 ms. This requires the audio thread to service buffers approximately once every 8 ms, which was not feasible on previous Android releases. It is still not feasible in a Dalvik thread.
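The numbers above follow directly from buffer size and sample rate. A quick sanity check (the 48 kHz rate and the three-buffer pipeline depth are my assumptions for illustration, not figures stated by the synthesizer project):

```java
public class BufferMath {
    // Time covered by one audio buffer, in milliseconds.
    static double bufferPeriodMs(int frames, int sampleRateHz) {
        return 1000.0 * frames / sampleRateHz;
    }

    public static void main(String[] args) {
        double period = bufferPeriodMs(384, 48000);
        System.out.println(period);     // 8.0 -> the callback must be serviced every ~8 ms
        // With, say, three buffers of pipeline between app and speaker,
        // output latency is around 3 * 8 = 24 ms, i.e. "under 30 ms".
        System.out.println(3 * period); // 24.0
    }
}
```

The same arithmetic explains why Dalvik buffers must be larger: if garbage collection can stall a callback for tens of milliseconds, an 8 ms buffer period cannot be sustained without glitches.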

This explanation assumes two things: first, that you are requesting the smallest possible buffers from OpenSL and doing your processing in the buffer callback rather than with a buffer queue. Second, that your device supports the low-latency path. On most current devices you will not see much difference between AudioTrack and OpenSL ES. But on devices that support Jellybean+ and low-latency audio, OpenSL ES will give you the lowest-latency path.


IIRC, OpenSL is passed through the same interface as AudioTrack, so at best it will match AudioTrack. (FWIW, I'm currently using OpenSL for "low latency" output)

The sad truth is there is no such thing as low latency audio on Android. There isn't even a proper way to flag and/or filter devices based on audio latency.

What interface you'll want to use to minimize latency is going to depend on what you are trying to do. If you want to have an audio stream you'll be looking at either OpenSL or AudioTrack.

If you want to trigger some static oneshots you may want to use SoundPool. For static oneshots SoundPool will have low latency as the samples are preloaded to the hardware. I think it's possible to preload oneshots using OpenSL as well, but I haven't tried.


The lowest latency you can get is from SoundPool. There's a limit on how big a sound you can play that way, but if you're under the limit (1 MB, IIRC) it's pretty low latency. Even that's probably not 40 ms, though.

But it is faster than what you can get by streaming, at least in my experience.

Caveat: you may see occasional crashes in SoundPool on Samsung devices. I have a theory that it only happens when you access the SoundPool from multiple threads, but I haven't verified this.

EDIT: OpenSL ES apparently has extremely HIGH latency on Kindle Fire, while SoundPool is much better, but the reverse may be true on other platforms.


About the problem of a deterministic/constant latency, here you can find an interesting article:

APPROACHES FOR CONSTANT AUDIO LATENCY ON ANDROID

The core of their investigation is this: because the audio HAL, which sits at one of the deeper levels of the audio path and is responsible for the timing of the audio callback events, is vendor-implemented, the relative latencies can vary, especially on cheap hardware. So they suggest two approaches to reduce the variance of the latency. One is to control the callback timing by inserting audio at fixed intervals; the other is to filter the callback times, applying a smoothing filter to estimate the time at which a constant-latency callback should have occurred. With these two approaches they could significantly reduce the variance of the latency.
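The second approach (filtering callback timestamps) can be sketched as exponential smoothing of the offset between each callback's actual arrival time and its nominal slot on a fixed-interval grid. This is my own minimal illustration of the idea, not the article's exact algorithm:

```java
public class CallbackSmoother {
    // Estimate the steady offset of a jittery callback stream relative to an
    // ideal fixed-interval grid, using exponential smoothing. Scheduling audio
    // against grid time + smoothed offset yields a near-constant latency.
    static double smoothedOffsetMs(double[] arrivalMs, double periodMs, double alpha) {
        double offset = arrivalMs[0]; // offset of callback 0 from grid time 0
        for (int i = 1; i < arrivalMs.length; i++) {
            double nominal = i * periodMs;               // where the grid says callback i belongs
            double raw = arrivalMs[i] - nominal;         // jittery instantaneous offset
            offset = alpha * raw + (1 - alpha) * offset; // smooth out the jitter
        }
        return offset;
    }

    public static void main(String[] args) {
        // Callbacks nominally every 10 ms, arriving with jitter around a ~5 ms offset.
        double[] arrivals = {5.0, 16.2, 24.1, 35.8, 44.9, 55.3};
        System.out.println(smoothedOffsetMs(arrivals, 10.0, 0.2));
    }
}
```

With the sample data above the estimate settles close to 5 ms despite per-callback jitter of a few milliseconds; a smaller alpha smooths harder but adapts more slowly to real drift.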

It should also be mentioned that there is a new native Android audio API, AAudio.

AAudio API

It's available/stable from Android Oreo 8.1 (API level 27). There is also a wrapper, which dynamically chooses between OpenSL ES and AAudio and is much easier to code against than OpenSL ES. It's still in developer preview.

Oboe Audio Library


The best way to get low latency for native code on Android is to use Oboe.

https://github.com/google/oboe

Oboe wraps AAudio on newer devices. AAudio offers the lowest possible latency paths. If AAudio is not available then Oboe calls OpenSL ES. Oboe is much easier to use than OpenSL ES.

AAudio either calls through AudioTrack or through a new MMAP data path. AAudio makes it easier to get a FAST track because you can leave some parameters unspecified. AAudio will then choose the right parameters needed for a FAST track.
