The Turing-NLG system contains a whopping 17 billion parameters, making it the largest publicly known model of its kind. For comparison, Google’s BERT contains 340 million parameters, OpenAI’s GPT-2 has 1.5 billion, and Nvidia’s Megatron-LM has 8.3 billion.
Vole said on Monday that, like its rivals’ systems, Turing-NLG takes a writing prompt and uses it to generate what it predicts are human-like follow-on sentences.
“Microsoft is introducing Turing Natural Language Generation (T-NLG), the largest model ever published at 17 billion parameters, which outperforms the state of the art on a variety of language modelling benchmarks and also excels when applied to numerous practical tasks, including summarization and question answering”, Vole claimed.
Like its predecessors, the 17-billion-parameter model is built out of transformers, an AI architecture that processes the words of its input and output in parallel while taking context into account.
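To give a flavour of the idea, here is a minimal, hypothetical sketch of the scaled dot-product self-attention at the heart of a transformer. The toy dimensions and random weights are our own illustration, not anything drawn from T-NLG itself:

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """Scaled dot-product self-attention over a whole sequence at once.

    Every position attends to every other position in parallel, which is
    how transformers weigh context when processing or predicting words.
    """
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])           # pairwise context scores
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)    # softmax over positions
    return weights @ V                                # context-mixed vectors

rng = np.random.default_rng(0)
seq_len, d = 5, 8                                     # 5 toy "words", 8-dim embeddings
X = rng.standard_normal((seq_len, d))
Wq, Wk, Wv = (rng.standard_normal((d, d)) for _ in range(3))
out = self_attention(X, Wq, Wk, Wv)
print(out.shape)                                      # one context-aware vector per word
```

Real models stack dozens of such layers, with many attention heads each, but the principle is the same: every output word is computed with reference to every other word in the window.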
Meaningful, human-like text is tricky for machines to generate because it requires an appreciation of context: sentences that flit wildly between subjects come across as nonsensical, so some kind of train of thought has to be maintained, no matter how artificial or vacuous it is.
But Vole is being somewhat coy about the system, and has yet to release a technical paper describing in full how it works.
Boffins said that they used “an Nvidia DGX-2 hardware setup” and split the model across four V100 GPUs. A total of 256 V100s were required to train the hefty model on 174GB of internet-scraped text, we're told.
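A back-of-the-envelope calculation suggests why the model had to be sliced across multiple GPUs rather than run on one. The arithmetic below is our own rough estimate, assuming 16-bit weights and the 32GB V100 variant; Microsoft has not published these figures:

```python
# Rough memory estimate for a 17-billion-parameter model.
# Assumes fp16 weights (2 bytes per parameter); optimiser state and
# activations would add considerably more on top of this.
params = 17_000_000_000
bytes_per_param_fp16 = 2
v100_memory_gb = 32                                  # largest V100 variant

weights_gb = params * bytes_per_param_fp16 / 1e9
print(f"fp16 weights alone: {weights_gb:.0f}GB")
print(f"fits on one 32GB V100: {weights_gb <= v100_memory_gb}")

# Splitting the model across four V100s leaves ~8.5GB of weights per card,
# comfortably inside each GPU's memory.
per_gpu_gb = weights_gb / 4
print(f"per GPU when split four ways: {per_gpu_gb:.1f}GB")
```

On these assumptions the weights alone come to roughly 34GB, which won't fit on any single V100, hence the four-way split within the DGX-2.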
Vole has withheld the model, too. A “private demo” has been given to a small number of academics for testing and feedback purposes. It’s possible that Redmond won’t release the model at all, though it hinted that the state-of-the-art results provided “new opportunities for Microsoft and our customers”. “We don’t have any details to share about future plans”, a spokesVole said.