Natural Language Processing

On Thursday, OpenAI announced that they had trained a language model. They used a large training dataset and showed that the resulting model was useful for downstream tasks where training data is scarce. They announced the new model with a puffy press release, complete with an animation featuring dancing text. They demonstrated that their model could produce realistic-looking text and warned that they would be keeping the dataset, code, and model weights private. The world promptly lost its mind.

For reference, language models assign probabilities to sequences of words. Typically, they express this probability via the chain rule as the product of the probabilities of each word, conditioned on that word’s antecedents: $p(w_1,\ldots,w_n) = p(w_1)\cdot p(w_2|w_1) \cdots p(w_n|w_1,\ldots,w_{n-1}).$ Alternatively, one could train a language model backwards, predicting each previous word given its successors. After training a language model, one typically either 1) uses it to generate text by iteratively decoding from left to right, or 2) fine-tunes it on some downstream supervised learning task.
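To make the chain rule concrete, here is a minimal sketch in Python. It is a toy and not anything OpenAI trained: it assumes a count-based bigram model, which approximates each factor $p(w_i|w_1,\ldots,w_{i-1})$ by $p(w_i|w_{i-1})$, where a neural language model would condition on the full history. The boundary markers `<s>` and `</s>` and all function names are made up for illustration.

```python
import math
import random
from collections import Counter, defaultdict

# Hypothetical sentence-boundary markers, chosen for this sketch only.
BOS, EOS = "<s>", "</s>"


def train_bigram(corpus):
    """Count bigrams. `corpus` is a list of sentences, each a list of tokens."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        tokens = [BOS] + sentence + [EOS]
        for prev, word in zip(tokens, tokens[1:]):
            counts[prev][word] += 1
    return counts


def cond_prob(counts, prev, word):
    """Maximum-likelihood estimate of p(word | prev); 0.0 if `prev` is unseen."""
    total = sum(counts[prev].values())
    return counts[prev][word] / total if total else 0.0


def sequence_log_prob(counts, sentence):
    """Chain rule in log space: sum of log p(w_i | w_{i-1}) over the sentence."""
    tokens = [BOS] + sentence + [EOS]
    log_p = 0.0
    for prev, word in zip(tokens, tokens[1:]):
        p = cond_prob(counts, prev, word)
        if p == 0.0:
            return float("-inf")  # unsmoothed model: unseen bigram => probability 0
        log_p += math.log(p)
    return log_p


def generate(counts, rng, max_len=20):
    """Decode left to right, sampling each word given the previous one."""
    prev, out = BOS, []
    for _ in range(max_len):
        words = list(counts[prev])
        weights = [counts[prev][w] for w in words]
        word = rng.choices(words, weights=weights)[0]
        if word == EOS:
            break
        out.append(word)
        prev = word
    return out


if __name__ == "__main__":
    corpus = [["the", "cat", "sat"], ["the", "dog", "sat"], ["the", "cat", "ran"]]
    counts = train_bigram(corpus)
    print(math.exp(sequence_log_prob(counts, ["the", "cat", "sat"])))
    print(generate(counts, random.Random(0)))
```

Summing log-probabilities rather than multiplying raw probabilities avoids numerical underflow on long sequences. The `generate` function corresponds to option 1) above; fine-tuning, option 2), has no analogue in a count-based model like this one.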

Training large neural network language models and subsequently applying them to downstream tasks has become an all-consuming pursuit that accounts for a large share of the research in contemporary natural language processing.

Continue reading “OpenAI Trains Language Model, Mass Hysteria Ensues”

EMNLP – the conference on Empirical Methods for Natural Language Processing – was held this year in Copenhagen, the capital of the small state of Denmark. Nevertheless, this year’s conference had the largest attendance in EMNLP’s history.

The surge in attendance should not be too surprising, as it follows similarly frothy demand for other academic machine learning conferences, such as NIPS (which recently sold out before workshop authors could even submit their papers).

The EMNLP conference focuses on data-driven approaches to NLP, which really describes all work in NLP, so I suppose we can call it a venue for “very data-driven NLP”. It’s a popular conference, and the premier conference of SIGDAT (ACL’s special interest group for linguistic data and corpus-based approaches to NLP).

This event went off without a hitch, with plenty of eating and socializing space in the vicinity. For 1200 people. Must’ve been a lot of hard work.

Continue reading “A Random Walk Through EMNLP 2017”