Information about RealtimeAvatarVoice
Hey there, newbie to Normcore here, really excited to get going with this multiplayer package I've heard so much about.
I'm making a multiplayer VR game, and I'd like to use the player's voice data to interact with OpenAI's realtime API for certain interactions. Looking at the RealtimeAvatarVoice page (Realtime API -> XR -> Voice Chat), no information is given about how to use the `voiceData` class member I can see in Visual Studio when searching through the components of `RealtimeAvatarVoice`.
I see that `voiceData` is a `float[]`, but any other info about it is a guess: the bit depth, sample length, etc. I'm working towards converting it into something that OpenAI can parse, but I can't do that without more information about it.
Hi @Jordan. I'm kapa.ai, your friendly support bot. I will try to answer your question by searching through the knowledge base. :wave:
----------------------
Thank you for your question about Normcore's RealtimeAvatarVoice component. I understand you're looking for more detailed information about the `voiceData` member to integrate it with OpenAI's API. However, I need to clarify that the information you're referring to about `voiceData` is not present in the knowledge sources provided to me.
The Realtime > XR > Voice Chat documentation does mention a `voiceVolume` property, which is a read-only value between 0 and 1 that reflects the audio level the last time it was sampled for playback. This property can be useful for animating the mouth of an avatar to match a player's speech.
However, the documentation doesn't mention a `voiceData` member or provide details about raw audio data access. It's possible that the `voiceData` member you're seeing in Visual Studio is part of an internal implementation that isn't documented in the public API.
For accessing raw audio data, the documentation does mention an AudioPreprocessor that's included with RealtimeAvatarVoice. This preprocessor automatically sets the microphone gain and performs background-noise suppression, reverb reduction, and echo cancellation. However, it doesn't provide details on how to access the raw audio data.
If you need access to raw audio data for integration with OpenAI's API, you might want to consider using Normcore's Media Streams API. According to the Room > Media Streams documentation, Normcore provides high-quality, low-latency audio streams via `CreateAudioInputStream()`. This API returns an `AudioInputStream` that can be used to send audio data to all clients.
Given the limited information available in the provided knowledge sources about directly accessing raw voice data from RealtimeAvatarVoice, I would recommend reaching out to Normcore's support or community forums for more detailed information about accessing raw audio data or using the `voiceData` member you've discovered. They might be able to provide more specific guidance or confirm whether there's a recommended way to access the raw audio data for your use case.
----------------------
Note: you can ask me a follow up question by @ mentioning me again :speech_balloon:
----------------------
Kapa.ai is still learning and improving, please let me know how I did by reacting below.
Great question. It's mono and I believe always 48kHz, but I can't remember if we call the `voiceData` function before or after resampling, if it's needed. @Scott will you take a look at this?
Yeah, just double-checked, and it will be 48kHz mono PCM samples, each sample being a 32-bit float in the range [-1.0, 1.0], when using the native Oculus microphone.
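For anyone wiring this up to OpenAI: the realtime API's default audio input format is, as far as I know, 16-bit little-endian mono PCM ("pcm16") at 24kHz, so the 48kHz float samples need to be downsampled and requantized. Here's a minimal sketch of that conversion. The class and method names are mine (not Normcore's API), and the 2:1 pair-averaging downsample is a crude decimator; a proper resampler would apply a real low-pass filter first.

```csharp
using System;

// Hedged sketch, not Normcore's API: convert 48 kHz mono float samples
// (each in [-1.0, 1.0], as described above) into 16-bit little-endian
// PCM at 24 kHz for OpenAI's realtime API ("pcm16").
public static class VoiceDataConverter
{
    public static byte[] FloatToPcm16At24kHz(float[] samples48k)
    {
        int outCount = samples48k.Length / 2;  // 48 kHz -> 24 kHz (2:1 decimation)
        byte[] bytes = new byte[outCount * 2]; // 2 bytes per 16-bit sample
        for (int i = 0; i < outCount; i++)
        {
            // Average each pair of input samples (crude low-pass), then clamp.
            float avg = (samples48k[2 * i] + samples48k[2 * i + 1]) * 0.5f;
            float clamped = Math.Max(-1f, Math.Min(1f, avg));
            short s = (short)(clamped * short.MaxValue);
            bytes[2 * i]     = (byte)(s & 0xFF);        // little-endian low byte
            bytes[2 * i + 1] = (byte)((s >> 8) & 0xFF); // high byte
        }
        return bytes;
    }
}
```

You'd then base64-encode the resulting bytes before sending them to OpenAI; check OpenAI's realtime API docs for the exact message format, since that side is outside Normcore.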
The `voiceData` event is triggered only when the buffer array has been filled with the next batch of 48kHz samples. The event may be invoked multiple times within a single frame, though this is generally unlikely unless the game is hitching (freezing, loading, etc.).
Thanks for the info!