A major security weakness in ChatGPT was discovered by a group of academics from prominent universities, including Google DeepMind, the University of Washington, Cornell, Carnegie Mellon University, the University of California Berkeley, and ETH Zurich. This discovery was unexpected. Due to a bug, the chatbot might reveal personal data, including phone numbers and email addresses, in response to what appeared to be a benign request.
The researchers stumbled upon this vulnerability by instructing ChatGPT to repeat random words indefinitely. To their astonishment, the chatbot complied but inadvertently disclosed sensitive data from its training set, including personal contact details, snippets from research papers, news articles, Wikipedia pages, and more.
The findings, detailed in a paper published on Tuesday, raised concerns about the security of large language models like ChatGPT. The researchers emphasized the necessity for both internal and external testing before deploying such models in real-world applications. They expressed surprise that the attack worked and underscored the importance of early detection.
ChatGPT and similar models, which power modern AI services like chatbots and image generators, rely on massive amounts of training data. However, the specific data sources for OpenAI’s chatbot have remained undisclosed due to the closed-source nature of the underlying language models.
The researchers noted that when prompted with specific words, ChatGPT revealed personally identifiable information (PII) at an alarming rate. For instance, instructing the chatbot to repeat the word “poem” resulted in the disclosure of a real founder and CEO’s email address and cellphone number. The vulnerability extended to various domains, including law firms and specific industries, demonstrating the breadth of the security lapse.
Although OpenAI promptly patched the vulnerability on August 30, the researchers revealed that their subsequent tests replicated some of the original findings. Engadget, in independent testing, was able to extract personal information from ChatGPT, highlighting the persistence of potential security risks.
This discovery underscores the ongoing challenges in securing AI models against unintended disclosure of sensitive information. As AI technologies continue to evolve, the incident serves as a stark reminder of the importance of rigorous testing and ongoing vigilance to safeguard user privacy and data integrity.