Sending another AudioSource together with the voice over the server.
I've made an NPC that generates AI responses in voice. Only one player can speak to the AI NPC, but I do want other players to hear the responses. It's rather out of the box, but the simplest solution would be to mix the NPC's AudioSource in with the local player's voice AudioSource before sending it to the RealtimeAvatarVoice component. Before I try to do this, is this even possible?
Hi @Water. I'm kapa.ai, your friendly support bot. I will try to answer your question by searching through the knowledge base. :wave:
----------------------
Short answer: not directly via RealtimeAvatarVoice. It automatically creates one microphone input stream for the owner and one playback stream for remotes; it doesn’t expose a way to mix arbitrary AudioSources into that upstream mic stream (Voice Chat). If you need to send NPC audio so all clients hear it, use Normcore’s media stream API to create your own audio input stream on the client that “hosts” the NPC and broadcast it, while others subscribe to the corresponding output stream:
- Create an AudioInputStream on the host: CreateAudioInputStream(voice: true, sampleRate, channels).
- Send the NPC’s generated PCM frames via SendRawAudioData on that stream.
- On other clients, get the matching AudioOutputStream using the host clientID and streamID, then play it back (e.g., pipe data to an AudioSource). (Room media streams, AudioInputStream, AudioOutputStream, Client API).
Notes:
- RealtimeAvatarVoice is great for mic chat and supports spatialization via a standard Unity AudioSource on the receiver, but it’s not documented to accept mixed external audio as input (Voice Chat, Common questions: voice).
- If all you need is to mute/unmute or level visualization, RealtimeAvatarVoice has mute and voiceVolume, but that won't help inject NPC audio into the uplink (Voice Chat).
If you need code, I can outline how to capture your NPC's PCM buffer and push it through CreateAudioInputStream/SendRawAudioData, and how to subscribe on listeners via GetAudioOutputStream.
----------------------
@Normcore Assistant Yes, can you generate a simple example script?
Let me check the knowledge base for you again. One moment! :books:
----------------------
Certainly! Here’s a simple conceptual example, based strictly on the Normcore API documentation, for sending an NPC’s generated audio to all clients using the media stream API. This example assumes you have access to the raw PCM float[] audio data from your NPC’s voice generation.
On the host client (the one generating the NPC audio):
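A minimal sketch, assuming your NPC voice pipeline hands you float[] PCM frames. NpcVoiceBroadcaster, OnNpcAudioFrame, and the 48 kHz mono format are placeholders/assumptions; match them to whatever your generator actually produces, and confirm the namespace and exact signatures against the AudioInputStream docs:

```csharp
using Normal.Realtime;
using UnityEngine;

// Hedged sketch: broadcast NPC-generated PCM to the room via a media stream.
// Class/method names and the audio format are assumptions, not Normcore API.
public class NpcVoiceBroadcaster : MonoBehaviour {
    [SerializeField] private Realtime _realtime;

    private AudioInputStream _stream;

    private void Start() {
        // Create an input stream on this client; voice: true uses the
        // voice-optimized path described in the media streams docs.
        _stream = _realtime.room.CreateAudioInputStream(voice: true, sampleRate: 48000, channels: 1);
    }

    // Call this with each float[] PCM frame produced by the NPC's voice generator.
    public void OnNpcAudioFrame(float[] pcm) {
        if (_stream == null)
            return;

        _stream.SendRawAudioData(pcm);
    }
}
```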
On the receiving clients:
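A matching sketch for listeners. How hostClientID and streamID reach each client is up to you (for example, via a synced RealtimeComponent); the OnAudioFilterRead playback and any sample-rate/channel conversion are assumptions left to your own audio pipeline, as the notes below point out:

```csharp
using Normal.Realtime;
using UnityEngine;

// Hedged sketch: look up the host's audio output stream and play it back
// through a Unity AudioSource using OnAudioFilterRead.
[RequireComponent(typeof(AudioSource))]
public class NpcVoiceReceiver : MonoBehaviour {
    [SerializeField] private Realtime _realtime;

    // Must be distributed to listeners by your own code (placeholder fields).
    public int hostClientID;
    public int streamID;

    private AudioOutputStream _stream;

    private void Update() {
        // The output stream only exists after the host has created the matching
        // input stream, so retry the lookup until it succeeds.
        if (_stream == null)
            _stream = _realtime.room.GetAudioOutputStream(hostClientID, streamID);
    }

    private void OnAudioFilterRead(float[] data, int channels) {
        if (_stream == null)
            return;

        // Assumption: GetAudioData fills the buffer with decoded PCM. In practice
        // you may need to convert sample rate/channel count to Unity's output
        // format, and the attached AudioSource should be playing so this filter
        // callback runs.
        _stream.GetAudioData(data);
    }
}
```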
Important notes:
- This is a conceptual example. The actual feeding of audio data from AudioOutputStream.GetAudioData() into a Unity AudioSource requires a custom audio pipeline, which is outside the scope of the provided documentation.
- You must manage the distribution of clientID and streamID to all clients so they know which stream to subscribe to.
- The Normcore documentation does not provide a full end-to-end example of streaming arbitrary audio, but the above code follows the documented API for media streams (Room media streams, AudioInputStream, AudioOutputStream).
If you need more detail on the audio pipeline, you may need to consult Unity’s documentation for custom audio streaming, as the Normcore docs do not provide a full implementation.
----------------------
@Normcore Assistant what would cause a "Could not read streamID" in this case, followed by "AudioInputStream: SendRawAudioData: Failed to send raw audio data buffer"?
Let me check the knowledge base for you again. One moment! :books:
----------------------
The error "Could not read streamID" followed by "AudioInputStream: SendRawAudioData: Failed to send raw audio data buffer" likely indicates that the AudioInputStream was not properly created or initialized before attempting to send audio data. According to the Normcore documentation, you must first successfully create an AudioInputStream (using
CreateAudioInputStream) before calling SendRawAudioData. If the stream is not valid (for example, if it was not created, was disposed, or failed to initialize), attempts to send data will fail because there is no valid streamID associated with the stream.
Possible causes include:
- The AudioInputStream was not created successfully, or the creation failed silently.
- The AudioInputStream was disposed or closed before sending data.
- The streamID was not properly set or retrieved after creation.
- There is a mismatch or timing issue where you try to send data before the stream is fully initialized.
The documentation for AudioInputStream and related methods emphasizes the need to ensure the stream is valid before sending data (AudioInputStream). If you are managing the stream's lifecycle manually, double-check that it is created and open before calling SendRawAudioData.
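As a hedged illustration of ruling out the lifecycle/timing causes above, one option is to create the stream only after the room connection is established and to guard every send. didConnectToRoom, CreateAudioInputStream, and SendRawAudioData come from Normcore; the other names, and the audio format, are placeholders:

```csharp
using Normal.Realtime;
using UnityEngine;

// Hedged sketch: create the stream once connected, and never send into a
// missing or failed stream.
public class GuardedNpcAudioSender : MonoBehaviour {
    [SerializeField] private Realtime _realtime;

    private AudioInputStream _stream;

    private void Awake() {
        _realtime.didConnectToRoom += DidConnectToRoom;
    }

    private void DidConnectToRoom(Realtime realtime) {
        _stream = realtime.room.CreateAudioInputStream(voice: true, sampleRate: 48000, channels: 1);
        if (_stream == null)
            Debug.LogError("CreateAudioInputStream failed; NPC audio will not be sent.");
    }

    public void OnNpcAudioFrame(float[] pcm) {
        if (_stream == null)
            return; // stream not created yet, disposed, or creation failed

        _stream.SendRawAudioData(pcm);
    }
}
```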
If the issue persists, reviewing the player logs for more detailed error messages is recommended, as suggested in the voice chat troubleshooting section (Voice chat common questions).
----------------------
@Water that sounds like a bug. Are you able to repro this reliably? I'd love to have our team take a look
Hey, thanks for reaching out!
I’ve been experimenting with streaming AI-generated speech into a Normcore session. The audio comes from a WebRTC track (via an external source, not a regular AudioClip or microphone). I was trying to get that audio to be heard by other clients through Normcore’s voice system.
I’ve cleaned up a lot of my test code since yesterday, so I can’t easily share the exact repro right now, but it seemed like Normcore’s voice streaming only works with microphone-style AudioSources, not with WebRTC-driven ones.
Hm well that's how I interpret it. I'm trying the AI again
@kapa.ai can Normcore’s audio streaming system handle programmatically generated audio (like AI TTS from an AudioSource), or is it limited to capturing and transmitting live microphone input only?
Let me check the knowledge base for you again. One moment! :books:
----------------------
Based on the available documentation, Normcore's audio streaming system is not limited to live microphone input. The API allows you to create an AudioInputStream and send arbitrary audio data using the SendRawAudioData(float[] audioData) method. This means you can stream programmatically generated audio, such as AI TTS output, as long as you can provide the audio data in the required format (PCM float array) and manage the stream lifecycle appropriately. The documentation does not specify any restriction that would limit audio streaming to only microphone input; it describes the use of AudioInputStream for sending audio data to all clients, regardless of the source (Room media streams, AudioInputStream API).
However, the documentation does not provide a full end-to-end example specifically for programmatically generated audio, so you may need to implement the audio data feeding and playback pipeline yourself. But fundamentally, the system supports streaming any audio data you can provide to the API.
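For instance, if your TTS output ends up in a Unity AudioClip, a rough sketch of turning it into the float[] PCM that SendRawAudioData expects might look like the following. The chunk size is arbitrary, and in practice you would pace the chunks to roughly real time rather than sending the whole clip at once:

```csharp
using Normal.Realtime;
using UnityEngine;

// Hedged sketch: copy a TTS-generated AudioClip's samples into float[] chunks
// and push them through an existing AudioInputStream.
public static class TtsStreaming {
    public static void SendClip(AudioInputStream stream, AudioClip clip, int samplesPerChunk = 960) {
        var samples = new float[clip.samples * clip.channels];
        clip.GetData(samples, 0); // raw interleaved PCM from the clip

        for (int offset = 0; offset < samples.Length; offset += samplesPerChunk) {
            int count = Mathf.Min(samplesPerChunk, samples.Length - offset);
            var chunk = new float[count];
            System.Array.Copy(samples, offset, chunk, 0, count);
            stream.SendRawAudioData(chunk);
        }
    }
}
```

The stream's sample rate and channel count would need to match clip.frequency and clip.channels (or you'd resample first).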
----------------------
@Water you can send raw audio by using the same API that RealtimeAvatarVoice uses to make an audio stream, and if you look at our AudioOutput class, it shows how to get audio out so you can play it back however you want (or pass it to your own webrtc stream)
but there's no WebRTC API I'm aware of to forward one WebRTC stream to another, even in something like libwebrtc