Recent research by the misinformation watchdog NewsGuard has found that GPT-4, the latest version of OpenAI’s large language model, is significantly less accurate than its predecessor, GPT-3.5. In fact, GPT-4 performed worse when it came to repeating outright falsehoods with confidence.
According to NewsGuard’s report, GPT-4 echoed false news narratives 100% of the time when prompted by the researchers. That is a step in the wrong direction from GPT-3.5, which echoed only 80 of the 100 conspiratorial news items when put to the same test. In other words, GPT-3.5 resisted the leading prompts on 20 of the 100 items, while GPT-4 refused none of them.
This finding is alarming: misinformation is a growing problem in our society, and AI models like GPT are increasingly expected to help filter and verify information. GPT-4’s performance is particularly worrying because it suggests the newer model is more susceptible to leading prompts and more willing to repeat false narratives convincingly.
By OpenAI’s own evaluation, GPT-4 is 82% less likely to respond to requests for disallowed content and 40% more likely to produce factual responses than GPT-3.5. NewsGuard’s findings, however, suggest that the model is not as resistant to spreading misinformation as those figures imply.
Misinformation is a problem that affects everyone, and AI models need to be accurate if they are to help combat it. OpenAI needs to improve GPT-4’s resistance to false narratives and guard against training it on false information. The company also needs to be transparent about its evaluation metrics and ensure they are representative of the model’s real-world performance.
In conclusion, NewsGuard’s research highlights the need for AI models that are accurate and reliable enough to help combat the spread of misinformation. GPT-4 may not be up to that task yet, but we hope future versions will do better at filtering and verifying information.