The amazing capabilities of ChatGPT, the chatbot from startup OpenAI, have sparked new interest and investment in artificial intelligence. But late last week, the CEO of OpenAI warned that the research strategy that gave rise to the bot is exhausted. It’s unclear exactly where future breakthroughs will come from.
OpenAI has delivered a number of impressive advances in language-powered AI in recent years by taking existing machine learning algorithms and scaling them to a size never before imagined. GPT-4, the latest of those projects, was probably trained using trillions of words of text and many thousands of powerful computer chips. The process cost more than $100 million.
But the company’s chief executive, Sam Altman, says further progress won’t come from making bigger models. “I think we’re at the end of the era where these are going to be these, like, giant, giant models,” he told an audience at an event held at MIT late last week. “We will improve them in other ways.”
Altman’s statement suggests an unexpected twist in the race to develop and deploy new AI algorithms. Since OpenAI launched ChatGPT in November, Microsoft has used the underlying technology to add a chatbot to its Bing search engine, and Google has launched a rival chatbot called Bard. Many people have been quick to experiment with using the new generation of chatbots to help with work or personal tasks.
Meanwhile, numerous well-funded startups, including Anthropic, AI21 Labs, Cohere, and Character.AI, are pouring huge resources into building bigger and bigger algorithms in an effort to catch up with OpenAI’s technology. The initial version of ChatGPT was based on a slightly improved version of GPT-3, but users can now also access a version powered by the more capable GPT-4.
Altman’s statement suggests that GPT-4 could be the last major advance to emerge from OpenAI’s strategy of making models bigger and feeding them more data. He did not say what kinds of research strategies or techniques might take its place. In the document describing GPT-4, OpenAI says its estimates suggest diminishing returns with increasing model size. Altman said there are also physical limits to how many data centers the company can build and how fast it can build them.
Nick Frosst, a Cohere co-founder who previously worked on AI at Google, says Altman’s sentiment that going bigger won’t work indefinitely rings true. He also believes that progress on transformers, the type of machine learning model at the heart of GPT-4 and its rivals, lies beyond scaling. “There are lots of ways to make transformers much better and more useful, and many of them don’t involve adding parameters to the model,” he says. Frosst says that new AI model designs, or architectures, and further tuning based on human feedback are promising directions that many researchers are already exploring.
Each version of OpenAI’s influential family of language algorithms consists of an artificial neural network, software inspired by the way neurons work together, that is trained to predict the words that should follow a given text string.
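The models described above are vast neural networks, but the core training objective — given a string of text, predict the word that should follow — can be illustrated with a deliberately simple toy. The sketch below uses bigram counts rather than a neural network (a stand-in chosen for brevity, not OpenAI’s actual method), and the corpus and function names are hypothetical:

```python
from collections import Counter, defaultdict

def train_bigram(corpus):
    """For each word, count which words follow it across the corpus."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.split()
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    return counts

def predict_next(counts, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    if word not in counts:
        return None
    return counts[word].most_common(1)[0][0]

# Tiny illustrative corpus (hypothetical).
corpus = [
    "the model predicts the next word",
    "the next word follows the text",
]
model = train_bigram(corpus)
print(predict_next(model, "the"))  # "next" follows "the" most often here
```

A real language model replaces the count table with a neural network that scores every possible next token from the full preceding context, and GPT-scale systems learn those scores from trillions of words, but the prediction task itself is the same shape.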