According to the researchers responsible for the Meta study, this “suggests that almost all knowledge in large language models is learned during pretraining, and that only a limited set of tuning data is necessary to teach models to produce high-quality output”.
The conclusion, then, is that the RLHF technique does not deliver as many improvements as previously assumed, and that by forgoing or minimizing this method, the cost of training an AI language model can be reduced.
What the Meta-led team has achieved is outstanding, although the researchers acknowledge at least two deficiencies in LIMA. First, creating datasets of high-quality examples is a challenging approach that is difficult to scale. Second, LIMA is not as robust as models already available as products, such as GPT-4. Although the Meta model generates good responses, an adversarial prompt or an unlucky sample in its training can lead it to give imprecise answers.
Even with these caveats, Yann LeCun, head of AI research at Meta, stated in a tweet that LIMA's behavior shows that investing in the development of new, large LLMs will be important in the short term, but will lose relevance in the medium term.
Meta’s approach to dominating the AI market
The development of LIMA is the result of a collaboration between Meta and researchers from Carnegie Mellon University, the University of Southern California and Tel Aviv University, a detail that explains much of the approach Mark Zuckerberg's company is taking to advance in the increasingly competitive AI market. The combination of open-source proposals with a leaner, more flexible internal structure seems to be the strategy.
We believe that the entire AI community (academic researchers, civil society, policy makers and industry) should work together to develop clear guidelines on responsible AI in general and responsible large language models in particular.
This community collaboration approach has carried over into the company's latest AI-related releases. This week, for example, Meta introduced Massively Multilingual Speech (MMS) models, whose technology can extend the capabilities of text-to-speech and speech-to-text tools to more than 1,100 languages and allows them to identify more than 4,000 spoken languages.
Meta's new MMS models are open source, and the company invites researchers and developers to build on them to create new applications and developments capable of reflecting the planet's vast linguistic heritage.
The direction the company has taken is interesting, to say the least, considering that it has significantly reduced its workforce and decided to stop hiring new personnel. Today, Meta began its latest round of layoffs, which will affect some 6,000 people. With this, the company has eliminated 27,000 jobs since November of last year, when it had a base of 87,000 employees. Meta has also closed about 5,000 vacant positions.
In the shadow of these mass layoffs, embracing the AI industry from an open-source perspective and with a smaller internal structure may be Meta's strategy for attracting the best new talent, allowing it to bet on a line of business more profitable than the metaverse.