From AI to ML to AI: On Swirling Nomenclature & Slurried Thought

Artificial intelligence is transforming the way we work (Venture Beat), turning all of us into hyper-productive business centaurs (The Next Web). Artificial intelligence will merge with human brains to transform the way we think (The Verge). Artificial intelligence is the new electricity (Andrew Ng). Within five years, artificial intelligence will be behind your every decision (Ginni Rometty of IBM via Computer World).

Before committing all future posts to the coming revolution, or abandoning the blog altogether to beseech good favor from our AI overlords at the AI church, perhaps we should ask, why are today’s headlines, startups and even academic institutions suddenly all embracing the term artificial intelligence (AI)?

In this blog post, I hope to prod all stakeholders (researchers, entrepreneurs, venture capitalists, journalists, think-fluencers, and casual observers alike) to ask the following questions:

  1. What substantive transformation does this switch in the nomenclature from machine learning (ML) to  artificial intelligence (AI) signal?
  2. If the research hasn’t categorically changed, then why are we rebranding it?
  3. What are the dangers, to both scholarship and society, of mindlessly shifting the way we talk about research to maximize buzz?

In the Beginning, There was “AI”

Well, perhaps not in “the beginning”, but certainly when I entered this world, AI was the preferred term of art to label the broad academic efforts at investigating machine intelligence.

Since the sudden embrace of the term AI reverses a previous rebranding, we ought to ask, what warranted the switch from AI → ML in the first place?

For us to be more than sales-{men,women}, we ought to have reasons for choosing our words beyond those same reasons that motivate marketers. It’s worth asking: what real change justified the initial rebranding of the field from AI to ML? And does this reversal in nomenclature correspond to a 180 in research direction along that same axis?

A brief incomplete history of AI

It’s hard to define AI, but CMU’s Dean of the School of Computer Science Andrew Moore gave my favorite definition to date in this interview with Forbes:

Artificial intelligence is the science and engineering of making computers behave in ways that, until recently, we thought required human intelligence.

It’s a nice definition because it captures (1) the balance of theoretical inquiry and engineering, (2) the breadth of the term, and (3) the sense in which the term is aspirational, a moving target based on those capabilities that humans possess but which machines do not.

Earlier AI work encompassed a broad set of methodological approaches, among which ML was only one. For the uninitiated reader, ML is simply the study of algorithms whose performance on tasks improves as a result of interactions with data, either by mining patterns from some compiled dataset (as in supervised learning) or by interacting with a dynamic environment (as in reinforcement learning).

For example, given a large dataset of MRI images and corresponding diagnoses, an ML algorithm might generate a predictive model. Then given subsequent MRI images, the model could predict the probability that an MRI depicts cancer (based on the patterns gleaned from the training data).
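To make the supervised-learning recipe concrete, here is a minimal sketch in plain Python. It is not a diagnostic system: the single numeric feature per scan and the eight training examples are made up purely for illustration, and logistic regression is just one simple choice of model that outputs a probability. The names `train_logistic` and `predict_proba` are hypothetical helpers, not part of any library.

```python
import math


def sigmoid(z):
    """Squash a real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(xs, ys, lr=0.1, epochs=5000):
    """Fit p(y=1 | x) = sigmoid(w*x + b) by batch gradient descent on log loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = 0.0
        grad_b = 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(w * x + b) - y  # prediction minus label
            grad_w += err * x
            grad_b += err
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b


def predict_proba(model, x):
    """Probability that a new example belongs to the positive class."""
    w, b = model
    return sigmoid(w * x + b)


# Made-up training set: one hypothetical feature per scan, label 1 = positive.
train_x = [0.5, 1.0, 1.5, 2.0, 3.0, 3.5, 4.0, 4.5]
train_y = [0, 0, 0, 0, 1, 1, 1, 1]

# "Training" is nothing more than fitting parameters to the compiled dataset.
model = train_logistic(train_x, train_y)

# "Prediction" applies the fitted parameters to examples not seen in training.
print(predict_proba(model, 1.0))
print(predict_proba(model, 4.0))
```

Nothing in the sketch resembles reasoning or comprehension. The model is two numbers chosen to fit the patterns in the training data, which is the sense in which performance "improves as a result of interactions with data".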

Back in the 1980s, many researchers pursued other popular avenues of AI research that had no reliance on pattern recognition. At times, these other approaches attracted wider attention. For example, the IBM supercomputer Deep Blue relied entirely on tree search algorithms and heuristics. Other prominent researchers were focused on rule-based expert systems, attempts to codify the knowledge of domain experts as collections of if-then rules that could be executed automatically by computers. While the hype over expert systems never quite rivaled today’s fawning over machine learning, it did capture the attention of business executives and received coverage in popular venues like Harvard Business Review.

However (with apologies to readers for whom the narrative has become hackneyed), many of these research sub-communities within artificial intelligence overpromised and underdelivered. In a thought-piece published in AAAI in 1997, UCLA Professor of Computer Science Richard Korf noted that the term artificial intelligence had acquired a “taint” due to “failing to deliver on its early promises”. In the same article, he reveals that while IBM often described Deep Blue as a “supercomputer”, they took pains to explicitly claim “that Deep Blue did not use artificial intelligence”.

Per Korf’s analysis, the name artificial intelligence “itself, … seems much more grandiose and boastful than other early alternatives”. Moreover, he wrote at the time that:

 “work in AI is often thought to be lacking in rigor and without scientific basis. To some extent this is a product of the nature of the problem being studied, but it also is the fault of those of us in the field who have not been sufficiently diligent in applying rigorous scientific standards to our work.”

In this climate, the machine learning community, which was burgeoning with activity and (at that time) stood on increasingly sound theoretical footing, sought to differentiate itself by identifying primarily under the name machine learning and publishing primarily in venues more strictly associated with data-driven approaches, e.g. the NIPS, ICML, and COLT conferences, and the Machine Learning Journal, subsequently superseded by the Journal of Machine Learning Research.

The name gave machine learners a fresh slate in the eyes of funding agencies, entrepreneurs, and venture capitalists, and it more closely aligned ML with mathematics and statistics.

My purpose here is not to evangelize the ML approach or to impugn the alternatives. I simply argue that the switch to ML as the preferred identifier corresponded to an actual shift in research direction and that the term itself is reasonable.

Why Switch Again?

If the first name-change came to salvage a discipline amid crisis, then this recent adoption of AI most certainly did not.

Around 6 years ago, when I first thought about applying for a PhD, ML had just begun to enter the public discourse. Some companies, like Google, had long been using ML. Even at Google, the moneymaker was Google Web Search, widely known to rely more on expertly assembled pipelines of traditional information retrieval tools than on ML. The general perception was that two primary paths were open for ML scientists: online advertising and academic research. Overall, financial prospects appeared worse than riding the web 2.0 wave in SF, a reasonable price to pay for doing more interesting work.

In the last few years, of course, that idea has become quaint. Breakthroughs in deep learning (a subfield of ML) have rendered technologies feasible that power speech recognition and image classification aboard every major smartphone sold today. Each year, more companies acquire data suitable to ML and the algorithms have continued to grow more powerful. Demand for ML talent has never been higher, the tools never better, and the bar for employment never lower. ML technology appears poised to impact ever more businesses.

However, just a couple years after ML entered the public consciousness, despite the absence of any appreciable pivot away from ML approaches, something strange happened: people started calling it AI.

The Sudden Embrace of “AI”

A reasonable person stumbling upon the branding gymnastics might suppose that research must have shifted away from ML and back towards less data-driven approaches.

Puzzlingly, that’s not the case. If anything the dominance, both in academic and industrial spheres, of data-driven methods has only increased. The ML subfield has never consumed a greater share of the AI pie. Instead, perhaps the shift is better explained via the vices that Korf associated with the term AI itself: grandiosity and boastfulness. Perhaps, as Ali Rahimi noted in his test of time talk at NIPS 2017, the shift also signals a renewed lack of rigor.

Below, I outline some specific reasons that I speculate explain why stakeholders, whether consciously or by passively following incentives, are pushing the term “AI” at this moment.


As ML research has transitioned from a primarily academic discipline to a matter of public interest, more of the discourse takes place in public venues, through popular articles, press releases, and punditry. In these shallower dialogues, the discussion of technology is at best gestural. What company is doing big data? Doing ML? Doing AI? How much are researchers doing X getting paid?

Because the technology itself is discussed so shallowly, there’s little opportunity to convey any sense of technological progress by describing the precise technical innovations. Instead, the easiest way to indicate novelty in the popular discourse is by changing the name of the field itself!

What was Google working on 6 years ago? Big data. Four years ago? Machine learning. Two years ago? Artificial intelligence. Two years from now? Artificial general intelligence!

Whether or not the technological progress provides any intellectually sensible justification for relabeling a field of research, readers respond to periodic rebranding. Researchers in turn have an incentive to brand their work under the new name in order to tap into the press coverage.


The term AI carries 63 years of baggage. While a generation of grizzled scholars might have associated the term with expert systems, the common reader associates AI with an expansive body of fictional references from cinematic and literary science fiction. In most of these works, “AI” isn’t used as the name of a field but rather to refer to specific entities as “artificial intelligences”. They are generally depicted as conscious, anthropomorphic agents.

By using the term AI, journalists and evangelists get their audiences excited at the expense of misleading them. The New York Times, Wired, and other credible sources have fully embraced the term AI as a substitute for ML. As a result, when the charlatans announce “The World’s First Album Composed and Produced by an AI”, audiences are liable to take the bait.


In a recent blog post, Berkeley Professor Michael Jordan took on the hype around AI, and delved a bit into the use of the word itself in the discourse on machine intelligence.  While he goes on to propose addressing an already confused nomenclature by introducing a host of new acronyms (never a good idea), his strongest point to my ears was this:

The current public dialog about these issues too often uses “AI” as an intellectual wildcard, one that makes it difficult to reason about the scope and consequences of emerging technology.

In other words, the lack of specificity allows journalists, entrepreneurs, and marketing departments to say virtually anything they want. The failure to articulate a clean point is weaponized to blunt opposition, similarly to how the president uses the phrase “we’ll see”.


To wrap up, I’d argue that rigor and scholarship don’t just pertain to one’s discipline in making arguments in papers and running proper ablation studies. These are questions of culture. Do we fundamentally value clarity, honesty, and scholarship, or are we willing to do or say anything because it’s presently expedient or trendy?

To put the present argument in context, I wouldn’t rule out the use of the term artificial intelligence. I’m actually quite fond of Andrew Moore’s definition, and as an aspirational term it may be a good word to keep around. For example, in some contexts like academic departments, the term really is used to refer more expansively to a combination of ML and non-ML approaches to building machines that perform tasks that until recently only humans could perform – e.g. the intersection of robotics, planning, and machine learning so fruitfully explored by many of my colleagues at CMU.

Still, this doesn’t absolve us of the responsibility to introspect and to critically consider the abrupt pivot in nomenclature when applied to purely machine-learning-based research. Why are writers who only cover machine learning suddenly called AI journalists? And why are startups that simply apply linear models and/or deep learning in web applications suddenly called AI (vs ML) startups? We ought to examine the underlying motivations, and to consider what this capricious subservience to trendiness says about the culture of scholarship in our field.

The AI Misinformation Epidemic

This post continues a long concentration on the AI misinformation epidemic.

Author: Zachary C. Lipton

Zachary Chase Lipton is an assistant professor at Carnegie Mellon University. He is interested in both core machine learning methodology and applications to healthcare and dialogue systems. He is also a visiting scientist at Amazon AI, and has worked with Amazon Core Machine Learning, Microsoft Research Redmond, & Microsoft Research Bangalore.

16 thoughts on “From AI to ML to AI: On Swirling Nomenclature & Slurried Thought”

  1. Hi, Zach. Great article!
    Just pointing out a (likely) typo: “algorithms have continue*s* to grow more powerful”

  2. Nice post, I always enjoy your writing. FYI in the first paragraph, the CEO of IBM quoted in the article is “Ginni Rometty”

  3. Personally, the language I use to describe my work shifted from ML to AI when my research shifted from working with crafted representations to learned representations. In that sense I’m pushing back against your categorization of deep learning as a subfield of ML. How do you feel about the following distinction: ML is about crafting good representations and using data to learn good mappings into and out of these representations, while current-day AI/DL research values an end-to-end approach, seeking ways to leverage the power of learned representations.
    [Also, thanks for taking the time to write a thoughtful post!]

    1. Thanks for the thoughtful comment Tom! I don’t agree with the main point here that DL, due to ***learning representations*** is anything other than a subcategory of ML. If anything it is precisely ML, and a fuller realization of it than classical pipelines with manually crafted features. The only coherent definition of ML that I know is due to my colleague Tom Mitchell: “A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.”

      So what I’d argue is that this is a problematic tendency we have: when we get good at X we feel the need to brand it as something other than X. It goes beyond the terms “AI” and “ML”. We get good at supervised answer prediction given (question, context) tuples, and suddenly this is not answer prediction, it is “reading comprehension”! When did the research change qualitatively vis-a-vis anthropomorphic traits like comprehension? It didn’t! But the success grants power, even the power to mislead about what we were doing in the first place.

  4. Hello!
    As an ordinary citizen of Korea (a non-expert with no real knowledge of AI), I read this post and the previous one (The AI Misinformation Epidemic) with great interest. Here, 98% of the AI-related articles on the internet treat AI/ML as conscious, anthropomorphic agents. Maybe that is why your articles made such an impression on me. I have started to have some questions about the culture(?) in AI/ML, so I am asking them here. Would that be okay?

    1. Why do some researchers/engineers in AI/ML hype the research? Without any hype, are they afraid of a so-called ‘AI winter’? In my view, researchers and engineers themselves create public fear of AI by depicting it as human-like.

    2. What is the fundamental, ultimate goal and purpose of the AI discipline? Analyzing the human brain?
    I can’t tell what direction researchers are pursuing.

  5. Hello! I read both your posts (AI epidemic, and this one) with great interest.
    I am a non-expert, an ordinary citizen in Korea, and here 95% of media articles hype AI/ML by depicting conscious, anthropomorphic agents. So I thought deeply about this post.
    At the same time, I want to know something about ML/AI. Maybe these questions are stupid, but could I ask them, and could I get answers?

    1. Why do some AI/ML researchers hype ML/AI research? Is it a way of creating the misconception of a human-like being through anthropomorphism?

    2. What is the fundamental/ultimate purpose and goal of ML/AI research? I am confused about the direction of ML/AI: why try to achieve tasks (ones considered to require intellect) with computers?

  6. Hi Zachary,

    Nice article! Very good that you clarified a bit of the history of the terms AI and ML, since most people have no clue. I do the same in my ML and Deep Learning classes at Ghent University.
    As for the reason why the term AI is now so popular: I believe it’s mostly laziness. Whenever I used to talk to some journalist (who usually has no clue whatsoever) about machine learning, they seemed to imagine something related to mechanical machines and start asking the weirdest questions. If you just say AI, they’re happy in their impression that they know what you’re talking about. Since most management people in industry equally have no clue, just using the term AI makes everyone’s life much easier. That’s my guess. Maybe we should take up ‘Artificial Learning’ instead of ML.

  7. I have also wondered why everyone switched from describing their work as ML to describing it as AI. To what extent was this driven by startup companies trying to raise money? Many ML startups have provocative names such as DeepMind, Sentient, Brain Corp, AI Brain, and Numenta.

    Of course, Google also created Google Brain in-house. Why choose that name?

    1. Funny! While I’ve known Jeff Hawkins to be a bit aggressive in overselling what Numenta produces, I didn’t know what you meant about it being a provocative name until I just looked it up on Wikipedia, finding:

      “The company name comes from the Latin mentis (“pertaining to the mind”) genitive of mēns (“mind”)”

  8. A good article.
    I learned of your blog in the “Monde diplomatique” of August 2018.
    Concerning “A brief incomplete history of AI” and the mention of the “expert systems” technology: may I say that all these expert systems were based on research on the use of rule-based systems for the simulation of human cognitive activities.
    Newell and Simon of CMU were pioneers in this field and, at the time, there was no commercial incentives governing these research activities and the students involved. For example, the early works of J.R. Anderson (psychology CMU) were exemplary for the comprehension of language development in children. That was quite important in the Ph.D. thesis of my wife concerning the “art” of psychological diagnostic of eating disorders (that and the “protocol analysis” methodology of Simon).
    Also, a lot of this spirit was “transferred” to the Université de Montréal in the person of George Baylor (a student of Herbert Simon) and Patrick Cavanagh and Jean Gascon (all CMU’s graduates).
    I was a student of those three (mainly Cavanagh for my Ph.D.) and I still believe that there is a future for the fine simulation of human cognitive activities: it is a fundamental aspect of AI research but, alas, not profitable on a short-term basis and generally hard to fund…
    I am getting older by the day but I am glad to adhere to this slogan: Long live protocol analysis and rule-based simulation!
    Martin GAGNON Ph.D.
    Professor (retired)
    Université du Québec à Montréal

  9. “Slurried Thought”–apropos–and certainly in the maelstrom of “civilization” it has ever been the norm. I found your article as I despaired of popular idiocy regarding machine intelligence specifically; and 1st for solace I visited Chaucer: “O stormy peple! Unsad and evere untrewe! Ay undiscreet and chaungynge as a fane!” then 2nd I entered into the Google Search Bar: “Nick Bostrom is full of sh_t”– which returned your essay here, so proof Google is intelligent. May your noble admonitions be remembered.

    But an apology for the Utopians: recent victories in the automation of perceptual functions reveal that in one aspect, applied ML transcends the ancient demarcations of statistics and discrete math–formal reasoning itself has never presumed (outside philosophy) to explain or to supply such fundamental cognitive phenomena as visual or auditory recognition–but it now becomes apparent that these are provided in the approximate solutions to high dimensional (non-linear & non-convex..) optimization problems, developed & attainable through the techniques of “ML”. We simple-minded primates might be forgiven then for prematurely apprehending the dawn of an age of intelligent machines! 🙂

  10. Thanks for the clarity. In radiology, we tried using ML and CVML but interviewers and even journals often changed it to AI. Most radiologists, whose domain is neither math nor CS based, seem to perk up to ‘AI’ and dim to ‘ML.’ As a result, we gave up and use AI. How much and what kind of damage, if you will, might our lack of rigor cause?

    Also curious about your advice on how we may improve communication and collaboration between the true academic ML community and us wanna-be, math-light, ML radiologists.

  11. Hi Zachary. Great set of articles! I have been banging on that the real ‘danger’ we face from ‘AI’ is its commercially driven premature deployment in situations of real consequence, such as prison tariffs and autonomous vehicles. Also that the danger that ‘AI’ faces from us is the real possibility of another AI Winter due to over-hype (I remember the first one in the 80s). You nicely captured both those scenarios.

    Only one criticism. Wherever you have used the term ‘methodologies’, you really mean ‘methods’. As a fellow PhD holder, I always feel the need to defend this word, the very specific meaning of which is disappearing through misuse.
