In case you missed it: OpenAI’s GPT-3 (Generative Pre-trained Transformer 3) has taken the Internet by storm.
In May 2020, researchers at San Francisco-based lab OpenAI published a paper on GPT-3. By mid-July 2020, GPT-3 had been released in a closed private beta, prompting thousands of tweets from excited users about the model’s capabilities.
According to WIRED, “The software’s viral moment is an experiment in what happens when new artificial intelligence research is packaged and placed in the hands of people who are tech-savvy but not AI experts.”
Reactions have been not just mixed, but polarized. On one end are the commentators predicting which professions will soon be edged out by GPT-3; on the other, underwhelmed experimenters pointing out what GPT-3 cannot do — or cannot do well.
As Kevin Lacker, Co-founder and former CTO of software company Parse (now owned by Facebook), concluded in a July 6, 2020 blog post: “GPT-3 is quite impressive in some areas, and still clearly subhuman in others.”
Guarding Against Misuse
GPT-3 is substantially more powerful than its predecessor, GPT-2. Both language models accept text input and then predict the words that come next. But with 175 billion parameters, compared to GPT-2’s 1.5 billion, GPT-3 is the largest language model yet.
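To make "predict the words that come next" concrete, here is a deliberately tiny sketch. It is not how GPT-3 works internally: GPT-3 is a large neural network, whereas this toy simply counts which word most often follows another in a short sample text. The function names and the sample sentence are invented for illustration.

```python
from collections import Counter, defaultdict


def train_bigram_model(text):
    """Count, for each word, which words follow it in the training text."""
    words = text.lower().split()
    follow_counts = defaultdict(Counter)
    for current_word, next_word in zip(words, words[1:]):
        follow_counts[current_word][next_word] += 1
    return follow_counts


def predict_next(model, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    followers = model.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]


def generate(model, prompt, max_words=5):
    """Extend the prompt one predicted word at a time."""
    words = prompt.lower().split()
    if not words:
        return ""
    for _ in range(max_words):
        next_word = predict_next(model, words[-1])
        if next_word is None:
            break
        words.append(next_word)
    return " ".join(words)


# Hypothetical sample text, standing in for the web-scale data a real model sees.
corpus = "the model reads text and the model predicts the next word"
model = train_bigram_model(corpus)
print(generate(model, "predicts", max_words=2))
```

The toy has only a handful of counts to draw on. GPT-3 performs the same basic task of continuing a prompt, but does so with 175 billion learned parameters.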
OpenAI acknowledged in the February 2019 blog post introducing GPT-2: “Due to concerns about large language models being used to generate deceptive, biased, or abusive language at scale, we are only releasing a much smaller version of GPT-2 along with sampling code.”
OpenAI eventually released GPT-2 in November 2019 “to aid the study of research into the detection of synthetic text,” while acknowledging that extremist groups could fine-tune GPT-2 to produce synthetic propaganda.
Despite that possibility, OpenAI reported that, up until that point, “we haven’t seen evidence of writing code, documentation, or instances of misuse. We think synthetic text generators have a higher chance of being misused if their outputs become more reliable and coherent.”
One writer has hailed GPT-3 as “the future we’ve been waiting for.” Is it possible that GPT-3 has reached that level of reliability and coherence? Of course, the output of a “black box” text-generating system is only as good as what goes in. Some critics on Twitter have drawn attention to examples of sexist and racist responses to prompts like “women” and “Holocaust.”
Citing research on offensive stereotypes produced by GPT-2, Anima Anandkumar, a director of machine learning research at NVIDIA, tweeted on June 11, 2020, “For @OpenAI to launch this during #BlackLivesMattters is tone deaf.”
Part of the problem, Anandkumar wrote in subsequent tweets, is that GPT-3 is being trained on content from Reddit, which is listed as one of the companies participating in the private beta testing.
The private beta was meant as a preventive measure, allowing OpenAI to vet potential users individually. “We will terminate API access for use-cases that cause physical or mental harm to people, including but not limited to harassment, intentional deception, radicalization, astroturfing, or spam,” OpenAI stated in a June 2020 announcement of the new API allowing access to GPT-3.
Of Limited Use to the Language Industry, For Now
The praise heaped on GPT-3 seems to imply that GPT-3’s abilities run the gamut. One writer on Medium claims it can code “in any language” without additional training. Another presents “conversations” with GPT-3 on topics ranging from the impact of coronavirus to mindful eating.
Manuel Araoz, Co-founder and former CTO of the tech company OpenZeppelin, prompted GPT-3 to write an article about itself, which has been making the rounds online.
Aside from issues of potentially biased output, GPT-3 has other shortcomings. As BuzzFeed data scientist Max Woolf pointed out, GPT-3 works fairly slowly, so someone who needs to complete a task quickly might abandon GPT-3 in favor of finishing the task alone.
As long as GPT-3 is only available via the OpenAI API, everyone has the same model, so it gives no one a competitive advantage. And a selection bias toward impressive examples means readers who have been wowed on Twitter rarely see the many instances in which GPT-3 does not perform as desired.
“When I was curating my generated tweets, I estimated 30–40% of the tweets were usable comedically, a massive improvement over the 5–10% usability from my GPT-2 tweet generation,” Woolf wrote. “However, a 30–40% success rate implies a 60–70% failure rate, which is patently unsuitable for a production application.”
“The GPT-3 hype is way too much. It’s impressive (thanks for the nice compliments!) but it still has serious weaknesses and sometimes makes very silly mistakes. AI is going to change the world, but GPT-3 is just a very early glimpse. We have a lot still to figure out.” (Sam Altman, @sama, July 19, 2020)
Even OpenAI Co-founder and CEO Sam Altman called for tempered expectations. “The GPT-3 hype is way too much,” he tweeted on July 19, 2020, noting that “it still has serious weaknesses and sometimes makes very silly mistakes.”
Could GPT-3 have an impact on machine translation? While it will be interesting to see GPT-3 translate, it is unlikely to perform better than a system built specifically for that task. And having more neural network layers is not the solution to the core MT challenges the language industry currently faces.
At this point, GPT-3’s future in the language industry is unclear. As Lambda School Co-founder and CEO Austen Allred put it in a July 16, 2020 tweet: “Can’t help but feel like GPT-3 is a bigger deal than we understand right now.”