Meta Releases 'Self-Taught Evaluator', Capable of Evaluating Other AI Models

Meta has released a new AI model, the "Self-Taught Evaluator," capable of evaluating other AI models.
The model uses a "chain of thought" technique and was trained entirely on AI-generated data.
This development could lead to autonomous AI agents that learn from their own mistakes.
Meta's release of the "Self-Taught Evaluator" marks a significant step towards autonomous AI.

Meta, the tech giant formerly known as Facebook, has released a new artificial intelligence (AI) model, the Self-Taught Evaluator, which is capable of evaluating the work of other AI models.

The model, announced on October 18, 2024, in New York, could reduce the need for human involvement in AI development. Meta first introduced the Self-Taught Evaluator in a paper published in August. The model employs a chain of thought technique similar to the one used in OpenAI's recently released o1 models.

This method involves breaking down complex problems into smaller, logical steps, thereby improving the accuracy of responses in challenging areas such as science, coding, and mathematics.
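To make the idea of step-by-step evaluation concrete, here is a minimal Python sketch of how an evaluator might be prompted to reason in steps before committing to a verdict. It is an illustration only, not Meta's code: the function names `build_judge_prompt` and `parse_verdict`, the prompt wording, and the "Verdict:" line format are all assumptions made for this example.

```python
import re
from typing import Optional


def build_judge_prompt(question: str, answer_a: str, answer_b: str) -> str:
    """Build a prompt asking a judge model to reason in steps, then decide.

    The wording and the 'Verdict:' convention are hypothetical.
    """
    return (
        "Compare two answers to the question below.\n"
        f"Question: {question}\n"
        f"Answer A: {answer_a}\n"
        f"Answer B: {answer_b}\n"
        "Work through the problem step by step, then finish with a line "
        "of the form 'Verdict: A' or 'Verdict: B'."
    )


def parse_verdict(trace: str) -> Optional[str]:
    """Return 'A' or 'B' from the last 'Verdict:' line, or None if absent."""
    matches = re.findall(r"^Verdict:\s*([AB])\s*$", trace, flags=re.MULTILINE)
    return matches[-1] if matches else None


# A hand-written reasoning trace standing in for real model output.
example_trace = (
    "Step 1: The question asks for 17 * 6.\n"
    "Step 2: 17 * 6 = 10 * 6 + 7 * 6 = 60 + 42 = 102.\n"
    "Step 3: Answer A says 102; Answer B says 112.\n"
    "Verdict: A"
)

print(parse_verdict(example_trace))
```

Keeping the intermediate steps in the output is what lets a reader, or another model, check how the verdict was reached rather than only what it was.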

The model was trained entirely on AI-generated data, eliminating human input during that stage. This development hints at the potential of creating autonomous AI agents capable of learning from their own mistakes, according to two of Meta's researchers who spoke to Reuters.
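One way to picture training without human labels is a filtering loop: start from answer pairs where the better answer is known by construction, ask the current judge to evaluate each pair, and keep only the judgments that pick the right answer as training examples for the next round. The Python sketch below is a toy illustration under those assumptions, not Meta's published pipeline. `SyntheticPair`, `collect_training_examples`, and `toy_judge` are hypothetical names, and `toy_judge` is a stand-in heuristic rather than a real model call.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class SyntheticPair:
    question: str
    better: str  # answer known to be preferable by construction
    worse: str   # answer known to be weaker by construction


# A judge maps (question, answer_a, answer_b) to (reasoning, verdict).
Judge = Callable[[str, str, str], Tuple[str, str]]


def collect_training_examples(
    pairs: List[SyntheticPair], judge: Judge
) -> List[Dict[str, str]]:
    """Keep only judgments whose verdict matches the known-better answer."""
    kept = []
    for pair in pairs:
        # The known-better answer is always shown in slot A here.
        reasoning, verdict = judge(pair.question, pair.better, pair.worse)
        if verdict == "A":
            kept.append(
                {
                    "question": pair.question,
                    "reasoning": reasoning,
                    "verdict": verdict,
                }
            )
    return kept


def toy_judge(question: str, answer_a: str, answer_b: str) -> Tuple[str, str]:
    """Stand-in for a model call: naively prefers the longer answer."""
    verdict = "A" if len(answer_a) >= len(answer_b) else "B"
    return (f"Answer {verdict} is more complete.", verdict)


demo_pairs = [
    SyntheticPair("What is 2 + 2?", "2 + 2 equals 4.", "5"),
    SyntheticPair("What is the capital of France?", "Paris", "The capital is Lyon."),
]

kept = collect_training_examples(demo_pairs, toy_judge)
print(len(kept))
```

In this toy run the naive judge gets the second pair wrong, so that judgment is discarded. A real system would also shuffle which slot holds the better answer so the judge cannot learn a position bias.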

Meta's Vision for Autonomous AI Agents

Such self-improving AI agents could one day operate as digital assistants, able to perform a wide range of tasks without human intervention. These models could also eliminate the current need for Reinforcement Learning from Human Feedback (RLHF), a process that relies on expensive, specialized human annotators to verify answers to complex queries.

"We hope, as AI becomes more and more super-human, that it will get better and better at checking its work, so that it will actually be better than the average human," said Meta researcher Jason Weston. This vision of AI's future, where it surpasses human capabilities, is shared by many in the field.

However, Meta is not alone in its pursuit of advanced AI capabilities. Other companies, including Google and Anthropic, have also published research on the concept of RLAIF, or Reinforcement Learning from AI Feedback. Unlike Meta, these companies tend not to release their models for public use.

Additional AI Tools from Meta

In addition to the Self-Taught Evaluator, Meta also released other AI tools. These include an update to the company's image-identification Segment Anything model, a tool that speeds up LLM response generation times, and datasets that can be used to aid the discovery of new inorganic materials.

Other technology companies are pursuing similar goals. For instance, OpenAI, the maker of ChatGPT, has been working on a novel approach to its AI models in a project code-named Strawberry. This project aims to deliver advanced reasoning capabilities, a goal it shares with Meta's Self-Taught Evaluator.

Similarly, Google has been enhancing its Maps app with new AI features, including machine learning algorithms that improve its Search function and the AR-powered 'Lens in Maps' feature. These developments, like Meta's new AI model, aim to make technology more useful and autonomous, reducing the need for human intervention.

By training AI models to evaluate and learn from their own mistakes, Meta is paving the way for a future where digital assistants can perform a vast array of tasks without human intervention.