Exploring GPT-3, the World’s Largest Language Model

Paymon · Published in The Startup · Dec 17, 2020


OpenAI’s release of GPT-3 was a major event in the machine learning and data engineering space. At 175 billion parameters, it was by far the largest Transformer-based language model in the world (Microsoft’s Turing-NLG came in second at 17 billion). Its predecessor GPT-2, at 1.5 billion parameters, was already so good at generating realistic text that OpenAI initially declined to release the full model’s weights, concerned about the creation and spread of fake news.

Chat playground

I entered the chat playground and started conversing with GPT-3 on a variety of topics. When approached with controversial social issues, it held a surprisingly neutral stance. When asked “What is the best way to use GPT-3?”, its first response was “Create a DOTA 2 bot that could compete against professionals.” It didn’t seem to have access to public records; it could only recognize celebrities or historical figures. It had difficulty with spatial visualization. But the chat partner maintained a consistent style and perspective for the duration of the conversation.

Comparison to a human

The human brain has roughly 100 to 500 trillion synaptic connections. If you mapped one neural network parameter to each connection (even though a biological neuron is far more complex than its software counterpart), you could estimate that GPT-3 is about 2,000 times smaller. It also doesn’t work with audio or visual input.
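
A quick back-of-the-envelope check of that ratio (the synapse counts are rough estimates, and equating one parameter to one synapse is a big simplification):

```python
# Compare GPT-3's parameter count to the 100-500 trillion synapse range cited above.
gpt3_params = 175e9  # 175 billion parameters

for synapses in (100e12, 300e12, 500e12):
    ratio = synapses / gpt3_params
    print(f"{synapses:.0e} synapses is ~{ratio:,.0f}x GPT-3's parameter count")
```

The ~2,000x figure sits comfortably in the middle of that range.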

OpenAI has released music writing engines (MuseNet) and image completion software (Image GPT) using similar technology. I’m not an expert in the AI space, but people who are have been freaking out over this release. Maybe the Terminator isn’t as far away as we think.

Pricing

The primary engine, Davinci, costs $0.06/1k tokens (roughly 750 words) to use in a commercial setting. Less powerful alternative engines go as low as $0.0008/1k tokens, but are heavily lacking in capabilities compared to Davinci. With all the chat bots out there, GPT-3 must have something to justify its price. The value lies in its flexibility: most AIs are tuned for a specific task, but GPT-3 can do anything from writing code to offering a medical diagnosis, and it can pick up a new task from a few examples in the prompt, much the way a human does.
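
To make the pricing concrete, here’s a rough cost estimate. It assumes 1,000 tokens is about 750 words, and that the $0.0008 engine is Ada (my assumption — the article prices are the only figures taken as given):

```python
# Rough per-engine cost estimate, assuming ~0.75 words per token.
PRICE_PER_1K_TOKENS = {"davinci": 0.06, "ada": 0.0008}  # dollars per 1k tokens

def estimate_cost(word_count: int, engine: str) -> float:
    tokens = word_count / 0.75
    return tokens / 1000 * PRICE_PER_1K_TOKENS[engine]

# A 5,000-word research article:
print(f"Davinci: ${estimate_cost(5000, 'davinci'):.2f}")  # ~$0.40
print(f"Ada:     ${estimate_cost(5000, 'ada'):.4f}")      # ~$0.0053
```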

Using the API

I created a Chrome extension that implemented one of its advertised functions: text simplification. It takes the advanced text in any research article and rewrites it so that a second grader could understand it.

Right-clicking on a paragraph generates a simplification
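
The extension itself is JavaScript, but at its core is a single completion request. Here’s a minimal Python sketch of what that call looks like against the 2020-era Completion endpoint; the prompt wording and parameter values are illustrative, not the extension’s actual settings:

```python
# Sketch of the simplification call, using the openai Python client's
# Completion API as it existed during the 2020 beta.
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder

def simplify(paragraph: str) -> str:
    # Prompt loosely modeled on OpenAI's "summarize for a 2nd grader" preset.
    prompt = (
        "My second grader asked me what this passage means:\n"
        f'"""{paragraph}"""\n'
        "I rephrased it for him, in plain language a second grader can understand:\n"
    )
    response = openai.Completion.create(
        engine="davinci",
        prompt=prompt,
        max_tokens=150,
        temperature=0.3,  # lower temperature keeps repeated calls more consistent
        stop=['"""'],
    )
    return response.choices[0].text.strip()
```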

Even setting aside the cost of the API call, a single paragraph takes almost a minute to simplify, and the accuracy is inconsistent: two API calls with identical parameters can produce vastly different results. A more invested developer could probably tweak the parameters (lowering the temperature, for instance) to get more consistent responses, but that still doesn’t address the speed. Google Translate can handle an entire page in the blink of an eye; with a maximum of 2048 tokens per call, this page would take at least ten minutes to simplify.
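
For a full page, that 2048-token context limit means chunking the text and making many sequential calls. A rough sketch of the chunking, where the ~700-word budget is my assumption to leave room for the prompt scaffolding and the completion:

```python
# Split page text into chunks small enough that the prompt, the chunk,
# and the completion all fit inside a 2048-token context window.
# Assumes ~0.75 words per token, so ~700 input words per chunk is a safe budget.
def chunk_text(text: str, max_words: int = 700):
    words = text.split()
    for i in range(0, len(words), max_words):
        yield " ".join(words[i : i + max_words])

# A 5,000-word page becomes 8 chunks; at roughly a minute per call,
# that is pushing ten minutes of waiting for a single page.
```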

I could see a realistic implementation of the app if it were restricted to common PDF documents and used a persistent data store. The most accurate simplifications are very impressive, explaining concepts in simple terms far better than most humans could.
