A group of researchers has made a groundbreaking achievement in the realm of secure communications. They have developed an algorithm that conceals sensitive information so effectively that it becomes undetectable.
This breakthrough could soon have significant implications for digital human communications, including social media and private messaging. In particular, it may benefit vulnerable groups such as dissidents, investigative journalists, and humanitarian aid workers who need a way to send information with perfect security.
The algorithm applies to steganography, which is the practice of hiding sensitive information inside innocuous content. Unlike cryptography, which conceals the content of a message, steganography conceals the fact that anything has been hidden at all. An example of this would be hiding a Shakespeare poem inside an AI-generated image of a cat.
Although steganography has been studied for over 25 years, existing approaches generally have inadequate security, meaning individuals using these methods risk being detected. This is because previous steganography algorithms would subtly change the distribution of innocuous content.
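To make the detection risk concrete, here is a minimal sketch of how a shift in distribution can be measured. It assumes KL divergence as the yardstick, which is a common choice in information-theoretic treatments of steganography but is not named in this article; the four-symbol distributions and the "naive" scheme are hypothetical toy values, not data from the study.

```python
import math

def kl_divergence(p, q):
    """KL(p || q) in bits for two distributions over the same symbols."""
    return sum(pi * math.log2(pi / q[s]) for s, pi in p.items() if pi > 0)

# Hypothetical distribution of innocuous content over four symbols.
cover = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}

# Hypothetical naive scheme: each symbol carries two secret bits, so the
# symbols come out uniformly distributed instead of following the cover.
naive_stego = {"a": 0.25, "b": 0.25, "c": 0.25, "d": 0.25}

shifted = kl_divergence(naive_stego, cover)   # positive: the outputs differ
unchanged = kl_divergence(cover, cover)       # zero: nothing to detect

print(f"naive scheme vs. cover: {shifted} bits")
print(f"cover vs. cover:        {unchanged} bits")
```

Any positive divergence gives an observer with enough samples a statistical handle for telling the two kinds of content apart; a divergence of zero gives none.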
However, the research team overcame this issue by using recent breakthroughs in information theory, specifically minimum entropy coupling, which joins two distributions of data in a way that maximizes their mutual information while preserving each individual distribution.
As a result, the new algorithm ensures no statistical difference between the distribution of innocuous content and the distribution of content that encodes sensitive information.
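The idea of a coupling can be illustrated with a short sketch. It uses a simple greedy heuristic, swapped in for illustration only: it is not the authors' implementation and does not guarantee the true minimum entropy. The function names and the toy message and cover distributions are assumptions made for this example.

```python
def greedy_coupling(p, q, eps=1e-12):
    """Greedy heuristic for a low-entropy coupling of marginals p and q.

    Returns {(i, j): mass}: a joint distribution whose row sums equal p
    and whose column sums equal q.
    """
    p, q = list(p), list(q)
    joint = {}
    while max(p) > eps and max(q) > eps:
        # Pair the largest remaining mass on each side.
        i = max(range(len(p)), key=p.__getitem__)
        j = max(range(len(q)), key=q.__getitem__)
        mass = min(p[i], q[j])
        joint[(i, j)] = joint.get((i, j), 0.0) + mass
        p[i] -= mass
        q[j] -= mass
    return joint

def marginals(joint, n_rows, n_cols):
    """Recover both marginal distributions from a joint distribution."""
    rows, cols = [0.0] * n_rows, [0.0] * n_cols
    for (i, j), mass in joint.items():
        rows[i] += mass
        cols[j] += mass
    return rows, cols

# Hypothetical inputs: four equally likely secret messages, and a model's
# distribution over four possible innocuous outputs.
message = [0.25, 0.25, 0.25, 0.25]
cover = [0.5, 0.25, 0.125, 0.125]

joint = greedy_coupling(message, cover)
message_marginal, cover_marginal = marginals(joint, len(message), len(cover))
print(joint)
print(cover_marginal)
```

Because the column sums of the joint distribution reproduce the cover distribution, outputs drawn through the coupling follow the same distribution as ordinary innocuous content, while each output still carries information about the message it is paired with.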
The algorithm was tested using several models that produce auto-generated content, such as GPT-2, an open-source language model, and WaveRNN, a text-to-speech converter. The new algorithm demonstrated up to 40% higher encoding efficiency than previous steganography methods across various applications, allowing more information to be concealed within a given amount of data. This may make steganography attractive even when perfect security is not required, because of the benefits for data compression and storage.
The researchers have filed a patent for the algorithm but intend to issue it under a free license to third parties for non-commercial responsible use. This includes academic and humanitarian use and trusted third-party security audits.
The research team has published this work as a preprint paper on arXiv and has open-sourced an inefficient implementation of their method on GitHub.
In May, they will also present the new algorithm at the premier AI conference, the 2023 International Conference on Learning Representations.
AI-generated content is increasingly used in ordinary human communications, driven by products like ChatGPT, Snapchat AI-stickers, and TikTok video filters. As a result, steganography may become more widespread, as the mere presence of AI-generated content will cease to arouse suspicion.
“Our method can be applied to any software that automatically generates content, such as probabilistic video filters or meme generators. This could be very valuable, for instance, for journalists and aid workers in countries where the act of encryption is illegal. However, users still need to exercise caution as any encryption technique may be vulnerable to side-channel attacks such as detecting a steganography app on the user’s phone,” said co-lead author Dr. Christian Schroeder de Witt, from the Department of Engineering Science at the University of Oxford.
Co-lead author Samuel Sokota from the Machine Learning Department at Carnegie Mellon University explained that the main contribution of their work is demonstrating a deep connection between a problem called minimum entropy coupling and perfectly secure steganography. By leveraging this connection, they have introduced a new family of steganography algorithms with perfect security guarantees.