OpenAI Launches New AI Tool to Catch ChatGPT Errors

28/06/2024

OpenAI, the artificial intelligence research lab behind ChatGPT, has announced CriticGPT, a tool designed to help human reviewers catch errors in code generated by ChatGPT. According to OpenAI, reviewers who had CriticGPT's help outperformed unassisted reviewers 60% of the time.

Training CriticGPT: A Collaborative Approach

CriticGPT, built on GPT-4, was trained with a distinctive process. AI trainers manually inserted bugs into ChatGPT-generated code and then wrote feedback as if they had discovered a genuine bug. The same trainers then compared several critiques of the modified code, which made it easy to tell whether a given critique had caught the inserted error. OpenAI evaluated CriticGPT both on these intentionally inserted bugs and on naturally occurring ChatGPT bugs that earlier trainers had already caught.
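To make the comparison step concrete, here is a minimal Python sketch of the idea: a known bug is planted, several critiques are collected, and those that catch the planted bug are ranked first. It is an illustration only, not OpenAI's implementation. The names `critique_catches_bug` and `rank_critiques` are hypothetical, and simple keyword matching stands in for the judgment that human trainers made.

```python
def critique_catches_bug(critique: str, bug_keywords: list[str]) -> bool:
    """Return True if the critique mentions every keyword describing the planted bug.

    Assumption: keyword matching is a crude stand-in for a human trainer
    deciding whether a critique identified the inserted bug.
    """
    text = critique.lower()
    return all(keyword.lower() in text for keyword in bug_keywords)


def rank_critiques(critiques: list[str], bug_keywords: list[str]) -> list[str]:
    """Order critiques so that those catching the planted bug come first.

    The sort is stable, so critiques with the same outcome keep their
    original relative order.
    """
    return sorted(
        critiques,
        key=lambda c: critique_catches_bug(c, bug_keywords),
        reverse=True,
    )


# Hypothetical example: a trainer planted an off-by-one error in a loop.
planted_bug_keywords = ["off-by-one", "last element"]
critiques = [
    "Looks fine to me.",
    "The loop uses range(len(xs) - 1), an off-by-one that skips the last element.",
]
ranked = rank_critiques(critiques, planted_bug_keywords)
```

Because the person comparing the critiques knows exactly which bug was inserted, they have a clear reference point for deciding which critique is better.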

Integration with RLHF and Beyond

OpenAI plans to incorporate CriticGPT-like models into its Reinforcement Learning from Human Feedback (RLHF) labeling pipeline, which is a key component in the training of its AI models. This integration will provide AI trainers with valuable assistance, further improving the training and evaluation processes.

Addressing Limitations and Future Directions

While CriticGPT shows great promise, OpenAI acknowledges that, like other AI tools, it can sometimes “hallucinate” or produce inaccurate results. Its evaluations are especially likely to be unreliable for highly complex tasks or responses. OpenAI says it is working to address these limitations and improve CriticGPT's reliability.

The Potential Impact of CriticGPT

The introduction of CriticGPT marks a significant step forward in the development of AI tools for code review. By leveraging AI to identify errors and improve code quality, it has the potential to streamline the development process, reduce bugs, and enhance the overall reliability of software. While the technology is still under development, the initial results are encouraging and suggest that AI-assisted code review may become a standard practice in the near future.