I’m building an iOS app in Expo (SDK 54, custom dev client → TestFlight) and I’ve spent a long time trying to get music ducking right. I’d love advice from anyone who’s solved this combo of constraints.
What I want (the “ideal” behavior):

- The user plays music from another app (Spotify, Apple Music, podcasts, etc.) before/during the workout.
- My app speaks coaching cues over the music using react-native-tts.
- While a TTS cue is speaking, the background music should duck (drop to ~20–30% volume), then return to full volume when the cue finishes.
- The user can also tap a “listening” button mid-workout to give voice feedback (“too hard”, “swap exercise”, etc.) via @react-native-voice/voice (speech-to-text).
- The app keeps playing in the background and on the lock screen with media controls (react-native-track-player).
My current setup:

- react-native-track-player configured with the iOS audio session category Playback and options MixWithOthers + AllowBluetoothA2DP. This is what runs persistently during the session.
- For TTS ducking, I flip the session options to include DuckOthers for the duration of each utterance, then flip them back.
- For voice input, my lib/voice-input.ts swaps the category to PlayAndRecord per startListening() call, then restoreAudioSession() swaps it back to Playback when listening ends.
- I pre-warm mic + speech-recognition permissions at player mount so the iOS dialog doesn’t pop mid-workout.
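For concreteness, here’s roughly the shape of that setup (a sketch, not my exact code: the `iosCategory`/`iosCategoryOptions` enums are from react-native-track-player v3+, the `tts-start`/`tts-finish` events are from react-native-tts, and `AudioSession` stands in for my small custom native bridge over AVAudioSession — that name is illustrative):

```typescript
// Persistent playback session + per-utterance DuckOthers flip (sketch).
import TrackPlayer, { IOSCategory, IOSCategoryOptions } from 'react-native-track-player';
import Tts from 'react-native-tts';
import { NativeModules } from 'react-native';

// Hypothetical custom native module wrapping AVAudioSession.setCategory.
const { AudioSession } = NativeModules;

const BASE_OPTIONS = ['mixWithOthers', 'allowBluetoothA2DP'];

export async function setupSession(): Promise<void> {
  await TrackPlayer.setupPlayer({
    iosCategory: IOSCategory.Playback,
    iosCategoryOptions: [
      IOSCategoryOptions.MixWithOthers,
      IOSCategoryOptions.AllowBluetoothA2DP,
    ],
  });

  // Registered once at mount: duck while any utterance is speaking.
  Tts.addEventListener('tts-start', () => {
    AudioSession.setOptions([...BASE_OPTIONS, 'duckOthers']);
  });
  Tts.addEventListener('tts-finish', () => {
    AudioSession.setOptions(BASE_OPTIONS);
  });
}
```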
What works:

- TTS cues are clear and crisp.
- Voice input works reliably and the listening waveform shows.
- Music keeps playing in the background and on the lock screen.
- No more choppy TTS or silent listening attempts.
What doesn’t work:

- Music does not duck when TTS speaks. It keeps playing at full volume, so the cue gets buried.
- Adding DuckOthers to the session options before each Tts.speak() and removing it afterward doesn’t actually duck the other app’s audio on a real device (TestFlight build, iPhone). It works “on paper” in that the option is set, but the other app’s volume never drops.
What I’ve already tried and rolled back:

- A persistent PlayAndRecord category (instead of swapping per-listen). On TestFlight this produced choppy TTS, no audible music, and silent listening attempts. Rolled back to the per-listen swap pattern above.
- expo-av’s Audio.setAudioModeAsync with interruptionModeIOS: DuckOthers. This conflicts with react-native-track-player’s session management: whichever one is set last wins, and they fight each other.
- expo-audio for ducking on Android, which works fine. iOS is the holdout.
Questions:

- Has anyone successfully ducked third-party music during react-native-tts cues on iOS while also using react-native-track-player for background playback?
- Is there a known-good order/timing for setting DuckOthers so iOS actually applies it? (Set before speak()? Set on the tts-start event? Set via the track-player session vs. directly through a native module?)
- Is the right answer to drop react-native-tts entirely and instead play pre-generated TTS audio files through react-native-track-player, so ducking is handled by the same session that owns playback?
- Does AVAudioSession’s setActive(false, options: .notifyOthersOnDeactivation) need to be called between TTS cues to “release” ducking back to the music app?
- Any TestFlight-specific gotchas? Things behave differently in Expo Go vs. dev client vs. TestFlight, and ducking seems to be one of them.
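To make the third question concrete, this is the shape I’d try if pre-generated cue files are the way to go — a sketch only: it assumes cue mp3s are bundled as assets (path is hypothetical), and I’m not sure combining MixWithOthers with DuckOthers in `iosCategoryOptions` is even the right incantation:

```typescript
import TrackPlayer, { IOSCategory, IOSCategoryOptions } from 'react-native-track-player';

// One session owns everything: playback that mixes with — and ducks —
// other apps' audio, so TTS files and music controls share a session.
export async function setupDuckingPlayer(): Promise<void> {
  await TrackPlayer.setupPlayer({
    iosCategory: IOSCategory.Playback,
    iosCategoryOptions: [
      IOSCategoryOptions.MixWithOthers,
      IOSCategoryOptions.DuckOthers, // unclear whether this combo ducks while active
      IOSCategoryOptions.AllowBluetoothA2DP,
    ],
  });
}

// Play a pre-generated TTS cue as a regular track.
export async function playCue(): Promise<void> {
  await TrackPlayer.add({
    url: require('../assets/cues/rest-30s.mp3'), // hypothetical bundled asset
    title: 'Coach cue',
    artist: 'MyApp',
  });
  await TrackPlayer.play();
}
```

If someone has this pattern working (especially whether the session needs to deactivate between cues for the music to come back up), that would answer questions 3 and 4 in one go.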
Stack:
-
Expo SDK 54 (custom dev client, not Expo Go)
-
react-native-tts (native iOS/Android)
-
react-native-track-player (background audio + lock screen controls)
-
@react-native-voice/voice (speech-to-text)
-
expo-audio (for Android ducking)
-
iOS 17+, tested via TestFlight on physical iPhone
Any pointers (code samples, native session config, or “switch libraries to X”) are appreciated. Happy to share more code from lib/voice-input.ts or my track-player setup if useful.
Greatly Appreciated!!