Cloud gaming has seen explosive growth, but syncing audio and video across devices remains a challenge. Scientists from MIT and Microsoft Research have introduced Ekho, a solution using inaudible white noise sequences to synchronize streams, reducing delays to less than 10 milliseconds in most cases.
The interstream delay challenge in cloud gaming comes down to a fundamental networking problem known as clock synchronization. Perfect synchronization is unattainable over a real network. Traditional clock synchronization approaches employ ping-pong messaging, in which a device sends a ping message to the server and the server responds with a pong message. However, these methods often yield unreliable results because of network asymmetry: when the two directions of the link take different amounts of time, halving the round-trip time gives the wrong answer, and the streams can remain noticeably out of sync.
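To make the asymmetry problem concrete, here is a minimal sketch of a ping-pong clock offset estimate. It is a generic illustration, not code from Ekho; the function names and the millisecond values are assumptions chosen for the example.

```python
def estimate_offset(t_send, t_server, t_recv):
    """Estimate (server clock - client clock) from one ping-pong exchange.

    Assumes the uplink and downlink each take half of the round trip.
    """
    rtt = t_recv - t_send
    return t_server - (t_send + rtt / 2)


def simulate(true_offset, uplink, downlink, t_send=0.0):
    """Simulate one exchange (all times in ms) and return the estimated offset."""
    t_server = t_send + uplink + true_offset  # server clock when the ping arrives
    t_recv = t_send + uplink + downlink       # client clock when the pong arrives
    return estimate_offset(t_send, t_server, t_recv)


symmetric = simulate(true_offset=5.0, uplink=20.0, downlink=20.0)
asymmetric = simulate(true_offset=5.0, uplink=35.0, downlink=5.0)
print(symmetric, asymmetric)
```

With symmetric paths the estimate matches the true 5 ms offset. With a 35 ms uplink and a 5 ms downlink, the estimate is off by half the difference between the two directions, here 15 ms, which is already above the roughly 10 ms at which delay becomes noticeable.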
Humans begin to notice interstream delay at around 10 milliseconds, so Ekho’s creators sought an alternative approach that uses the game audio itself for synchronization. The microphone on the player’s controller records the room’s audio, including game audio from the screen, and sends it back to the server. Ekho addresses the challenge of background noise by adding identical sequences of ultra-low-volume white noise (pseudo noise) to the game audio streamed to the screen, and uses these segments as markers for synchronization.
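The idea of locating a known noise marker inside a noisy recording can be sketched with a simple cross-correlation. This is a toy illustration under stated assumptions, not Ekho's actual estimator: the 48 kHz sample rate, the marker length, the seeds, and the marker volume (louder here than "ultra-low" so the toy stays short) are all hypothetical.

```python
import random

SAMPLE_RATE = 48_000  # Hz; assumed for this sketch


def make_marker(length, seed):
    """Generate a reproducible pseudo-noise sequence of +1/-1 samples."""
    rng = random.Random(seed)
    return [rng.choice((-1.0, 1.0)) for _ in range(length)]


def find_marker(recording, marker):
    """Return the lag (in samples) at which the marker correlates best."""
    best_lag, best_score = 0, float("-inf")
    for lag in range(len(recording) - len(marker) + 1):
        score = sum(m * recording[lag + i] for i, m in enumerate(marker))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


marker = make_marker(2048, seed=1)

# Simulated microphone recording: loud unrelated audio plus a quiet,
# delayed copy of the marker.
rng = random.Random(2)
true_delay = 480  # samples, i.e. 10 ms at 48 kHz
recording = [rng.uniform(-1.0, 1.0) for _ in range(3500)]
for i, m in enumerate(marker):
    recording[true_delay + i] += 0.2 * m

lag = find_marker(recording, marker)
delay_ms = 1000 * lag / SAMPLE_RATE
print(lag, delay_ms)
```

Because the marker is uncorrelated with the surrounding audio, its correlation score peaks sharply at the true offset even though the marker is much quieter than the background. A longer marker allows a quieter one, at the cost of more computation.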
Before implementing Ekho, a user study confirmed that players couldn’t hear the pseudo noise in the game audio. Additionally, these noise sequences are resilient to compression, a crucial consideration as controller-transmitted audio is heavily compressed for faster data transfer.
Ekho’s architecture consists of two modules: Ekho-Estimator and Ekho-Compensator. The former adds pseudo-noise sequences to the game audio and listens for these markers in the recorded audio from the controller, enabling precise interstream delay calculation. The latter either skips a few milliseconds of sound or inserts silence into the game audio from the server, bringing the streams back into sync.
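The compensation step, skipping audio or inserting silence, can be sketched in a few lines. The function name and the sign convention (positive means the audio is running late) are assumptions made for this illustration, not details from Ekho.

```python
def compensate(audio, delay_samples):
    """Shift an audio buffer to cancel a measured interstream delay.

    Assumed convention: a positive delay means the audio is late, so that
    many samples are skipped; a negative delay means it is early, so an
    equal amount of silence is inserted in front.
    """
    if delay_samples > 0:
        return audio[delay_samples:]
    if delay_samples < 0:
        return [0.0] * (-delay_samples) + audio
    return audio


late = compensate([0.1, 0.2, 0.3, 0.4], 2)   # skips the first two samples
early = compensate([0.1, 0.2], -2)           # prepends two samples of silence
print(late, early)
```

At a 48 kHz sample rate, skipping or inserting 480 samples would correspond to a 10 ms correction.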
In tests on real cloud gaming sessions, Ekho outperformed other synchronization methods, even with poor microphone quality or background noise interference. Ekho kept the interstream delay below 10 milliseconds nearly 87 percent of the time, while the other methods consistently exceeded 50 milliseconds.
The researchers are eager to explore Ekho’s performance in more complex scenarios, such as synchronizing five controllers with a single screen device. Additionally, future work may aim to extend Ekho’s range for synchronizing devices across larger spaces.
This unconventional use of inaudible white noise as a synchronization tool showcases the power of innovative thinking. Ekho’s potential extends beyond cloud gaming, offering improvements in user experiences across various multi-device streaming scenarios.