On Thursday, OpenAI announced that they had trained a language model. They used a large training dataset and showed that the resulting model was useful for downstream tasks where training data is scarce. They announced the new model with a puffy press release, complete with this animation (below) featuring dancing text. They demonstrated that their model could produce realistic-looking text and warned that they would be keeping the dataset, code, and model weights private. The world promptly lost its mind.
For reference, language models assign probabilities to sequences of words. Typically, they express this probability via the chain rule as the product of probabilities of each word, conditioned on that word’s antecedents. Alternatively, one could train a language model backwards, predicting each previous word given its successors. After training a language model, one typically either 1) uses it to generate text by iteratively decoding from left to right, or 2) fine-tunes it to some downstream supervised learning task.
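As a rough illustration of the chain-rule factorization and left-to-right decoding described above, here is a minimal sketch using a toy count-based bigram model. This is not the model from the announcement: a bigram model conditions each word only on the single preceding word, a simplifying assumption rather than conditioning on all antecedents. The function names, the `<s>` and `</s>` boundary tokens, and the three-sentence corpus are all hypothetical, chosen only for illustration.

```python
import math
from collections import Counter, defaultdict


def train_bigram(sentences):
    """Estimate P(next | prev) from raw bigram counts (no smoothing)."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        for prev, nxt in zip(tokens, tokens[1:]):
            counts[prev][nxt] += 1
    model = {}
    for prev, ctr in counts.items():
        total = sum(ctr.values())
        model[prev] = {word: c / total for word, c in ctr.items()}
    return model


def sequence_log_prob(model, sentence):
    """Chain rule: sum the log-probabilities of each word given its predecessor."""
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    total = 0.0
    for prev, nxt in zip(tokens, tokens[1:]):
        p = model.get(prev, {}).get(nxt, 0.0)
        if p == 0.0:
            # An unseen bigram gives the whole sequence zero probability.
            return float("-inf")
        total += math.log(p)
    return total


def greedy_decode(model, max_len=10):
    """Generate left to right, always picking the most probable next word."""
    output, prev = [], "<s>"
    for _ in range(max_len):
        nxt = max(model[prev], key=model[prev].get)
        if nxt == "</s>":
            break
        output.append(nxt)
        prev = nxt
    return output


if __name__ == "__main__":
    toy_corpus = ["the cat sat", "the cat sat", "the dog ran"]
    lm = train_bigram(toy_corpus)
    print(math.exp(sequence_log_prob(lm, "the cat sat")))
    print(greedy_decode(lm))
```

Neural language models replace the count table with a learned network that conditions on the whole prefix, but the scoring and decoding loops keep the same shape.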
Training large neural network language models and subsequently applying them to downstream tasks has become an all-consuming pursuit that accounts for an outsized share of the research in contemporary natural language processing.
Continue reading “OpenAI Trains Language Model, Mass Hysteria Ensues”
Whether you are speaking to corporate managers, Silicon Valley script kiddies, or seasoned academics pitching commercial applications of their research, you’re likely to hear a lot of claims about what AI is going to do.
Hysterical discussions about machine learning’s applicability begin with a breathless recap of breakthroughs in predictive modeling (9X.XX% accuracy on ImageNet!, 5.XX% word error rate on speech recognition!) and then abruptly leap to prophecies of miraculous technologies that AI will drive in the near future: automated surgeons, human-level virtual assistants, robo-software development, AI-based legal services.
This sleight of hand elides a key question—when are accurate predictions sufficient for guiding actions?
Continue reading “When are predictions policies?”
[This article is also cross-posted to the Deep Safety blog.]
Something I often hear in the machine learning community and media articles is “Worries about superintelligence are a distraction from the *real* problem X that we are facing today with AI” (where X = algorithmic bias, technological unemployment, interpretability, data privacy, etc). This competitive attitude gives the impression that immediate and longer-term safety concerns are in conflict. But is there actually a tradeoff between them?
We can make this question more specific: what resources might these two types of issues be competing for?
Continue reading “What are the tradeoffs between immediate and longer term AI safety efforts?”
[This article originally appeared on the Deep Safety blog.]
This year’s NIPS gave me a general sense that near-term AI safety is now mainstream and long-term safety is slowly going mainstream. On the near-term side, I particularly enjoyed Kate Crawford’s keynote on neglected problems in AI fairness, the ML security workshops, and the Interpretable ML symposium debate that addressed the “do we even need interpretability?” question in a somewhat sloppy but entertaining way. There was a lot of great content on the long-term side, including several oral / spotlight presentations and the Aligned AI workshop.
[This article originally appeared on the Deep Safety blog.]
Long-term AI safety is an inherently speculative research area, aiming to ensure safety of advanced future systems despite uncertainty about their design or algorithms or objectives. It thus seems particularly important to have different research teams tackle the problems from different perspectives and under different assumptions. While some fraction of the research might not end up being useful, a portfolio approach makes it more likely that at least some of us will be right.
In this post, I look at some dimensions along which assumptions differ, and identify some underexplored reasonable assumptions that might be relevant for prioritizing safety research. In the interest of making this breakdown as comprehensive and useful as possible, please let me know if I got something wrong or missed anything important.
Continue reading “Portfolio Approach to AI Safety Research”
(This article originally appeared here. Thanks to Janos Kramar for his feedback on this post.)
The overall theme of the ICLR conference setting this year could be summarized as “finger food and ships”. More importantly, there were a lot of interesting papers, especially on machine learning security, which will be the focus of this post. (Here is a great overview of the topic.)
On the attack side, adversarial perturbations now work in physical form (if you print out the image and then take a picture) and they can also interfere with image segmentation. This has some disturbing implications for fooling vision systems in self-driving cars, such as impeding them from recognizing pedestrians. Adversarial examples are also effective at sabotaging neural network policies in reinforcement learning at test time.
Continue reading “Machine Learning Security at ICLR 2017”
[This article is cross-posted from my blog. Thanks to Jan Leike, Zachary Lipton, and Janos Kramar for providing feedback on this post.]
This year’s Neural Information Processing Systems conference was larger than ever, with almost 6000 people attending, hosted in a huge convention center in Barcelona, Spain. The conference started off with two exciting announcements on open-sourcing collections of environments for training and testing general AI capabilities – the DeepMind Lab and the OpenAI Universe. Among other things, this is promising for testing safety properties of ML algorithms. OpenAI has already used their Universe environment to give an entertaining and instructive demonstration of reward hacking that illustrates the challenge of designing robust reward functions.
I was happy to see a lot of AI-safety-related content at NIPS this year. The ML and the Law symposium and Interpretable ML for Complex Systems workshop focused on near-term AI safety issues, while the Reliable ML in the Wild workshop also covered long-term problems. Here are some papers relevant to long-term AI safety:
Continue reading “AI Safety Highlights from NIPS 2016”
Last Friday, the University of Ca’ Foscari in Venice organized an IEEE workshop on the Human Use of Machine Learning (HUML 2016). The workshop, held at the European Centre for Living Technology, hosted roughly 30 participants and broadly addressed the social impacts and ethical problems stemming from the wide-spread use of machine learning.
HUML joins a growing number of workshops for critical voices in the ML community. These include Fairness, Accountability and Transparency in Machine Learning (FAT-ML), the #Data4Good workshop at ICML 2016, Human Interpretability of Machine Learning (WHI), held this year at ICML, and Interpretable ML for Complex Systems, held this year at NIPS. Among this company, HUML was especially notable for its diversity of perspectives. While FAT-ML, #Data4Good and WHI featured presentations primarily by members of the machine learning community, HUML brought together scholars from philosophy of science, law, predictive policing, and machine learning.
Continue reading “Machine Learning Meets Policy: Reflections on HUML 2016”
[This article is cross-posted from my blog. Thanks to Jelena Luketina and Janos Kramar for their detailed feedback on this post.]
There has been a lot of discussion about the appropriate level of openness in AI research in the past year – the OpenAI announcement, the blog post Should AI Be Open?, a response to the latter, and Nick Bostrom’s thorough paper Strategic Implications of Openness in AI development.
There is disagreement on this question within the AI safety community as well as outside it. Many people are justifiably afraid of concentrating power to create AGI and determine its values in the hands of one company or organization. Many others are concerned about the information hazards of open-sourcing AGI and the resulting potential for misuse. In this post, I argue that some sort of compromise between openness and secrecy will be necessary, as both extremes of complete secrecy and complete openness seem really bad. The good news is that there isn’t a single axis of openness vs secrecy – we can make separate judgment calls for different aspects of AGI development, and develop a set of guidelines.
Continue reading “Clopen AI: Openness in different aspects of AI development”
This morning, millions of people woke up and impulsively checked Facebook. They were greeted immediately by content curated by Facebook’s newsfeed algorithms. To some degree, this content might have influenced their perceptions of the day’s news, the economy’s outlook, and the state of the election. Every year, millions of people apply for jobs. Increasingly, their success might lie in part in the hands of computer programs tasked with matching applications to job openings. And every year, roughly 12 million people are arrested. Throughout the criminal justice system, computer-generated risk assessments are used to determine which arrestees should be set free. In all these situations, algorithms are tasked with making decisions.
Algorithmic decision-making mediates more and more of our interactions, influencing our social experiences, the news we see, our finances, and our career opportunities. We task computer programs with approving lines of credit, curating news, and filtering job applicants. Courts even deploy computerized algorithms to predict “risk of recidivism”, the probability that an individual relapses into criminal behavior. It seems likely that this trend will only accelerate as breakthroughs in artificial intelligence rapidly broaden the capabilities of software.
Turning decision-making over to algorithms naturally raises worries about our ability to assess and enforce the neutrality of these new decision makers. How can we be sure that the algorithmically curated news doesn’t have a political party bias or that job listings don’t reflect a gender or racial bias? What other biases might our automated processes be exhibiting that we wouldn’t even know to look for?
Continue reading “The Foundations of Algorithmic Bias”