Unveiling the Diminishing Brilliance: The Case of GPT’s Intelligence Deterioration Over Time

By Aya Mohammed On Jul 23, 2023

A recent study by researchers at Stanford University and UC Berkeley reported some surprising findings about GPT, the large language model that serves as the foundation for OpenAI’s ChatGPT chatbot. Contrary to claims made by company officials, the study found that GPT has actually become less intelligent over time.

The researchers observed significant changes in the behavior of both GPT-3.5 and GPT-4 over a few months, with a noticeable decrease in the accuracy of their responses. In March 2023, GPT-4 identified prime numbers with an impressive 97.6% accuracy, but by June 2023 its accuracy on the same task had dropped dramatically to a mere 2.4%.
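To make the accuracy figure concrete, here is a minimal sketch of how a yes/no prime-identification score could be computed. The numbers, the answer lists, and the function names are hypothetical illustrations and are not taken from the study, whose exact prompts and scoring are not described in this article.

```python
def is_prime(n: int) -> bool:
    """Ground truth: trial division, adequate for small test numbers."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    d = 3
    while d * d <= n:
        if n % d == 0:
            return False
        d += 2
    return True


def accuracy(numbers, answers):
    """Fraction of yes/no answers that match the ground truth."""
    correct = sum(1 for n, a in zip(numbers, answers) if a == is_prime(n))
    return correct / len(numbers)


# Hypothetical test set and two hypothetical sets of model answers.
numbers = [7, 9, 11, 15, 17]      # three primes, two composites
always_yes = [True] * len(numbers)
always_no = [False] * len(numbers)

print(accuracy(numbers, always_yes))  # 0.6
print(accuracy(numbers, always_no))   # 0.4
```

The two answer lists show that a score of this kind depends heavily on the mix of primes and composites in the test set: a model that always gives the same answer can score high or low without reasoning about any individual number.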

According to the researchers, both GPT-3.5 and GPT-4 also made more coding errors in June than they had in March. These findings support user complaints about a noticeable deterioration in the versions of the model released in the intervening months.

Peter Welinder, the Vice President of Products at OpenAI, attempted to dispel rumors that the model had been deliberately downgraded. He stated on Twitter, “No, we didn’t make GPT-4 dumber. Quite the opposite: we make each new version smarter than the previous one.” He suggested that continuous use of ChatGPT might lead users to notice issues they had not seen before.

However, the Stanford and Berkeley research may prove convincing evidence against this explanation. While the researchers did not pinpoint the reasons behind the decline in accuracy and capability, they noted that the evident deterioration challenges OpenAI’s assertion that its models continuously improve.

The paper concluded that the behavior and performance of both GPT-3.5 and GPT-4 changed significantly between the two dates, and that their capabilities on some tasks worsened over time. The researchers also raised an intriguing question: has GPT-4 genuinely become more powerful?

In light of these findings, it is essential to understand how a model update aimed at improving one aspect of performance can unintentionally degrade it in others. The study serves as a valuable reminder of the complexities involved in the evolution of AI language models.