But is it AI?

A discussion I find myself in more and more often lately starts with someone asking whether something is “really AI” or not. Often, what people seem to mean by that is whether someone is already using deep learning, or still “just” machine learning.

I mentioned this to a friend in the industry and he just rolled his eyes and said “I know! I’m not even having this discussion anymore!”

And I agree. Maybe it’s because I have been around this community for the last 20 years (read “I am old”). To me, Artificial Intelligence has always been the elusive goal, and machine learning has been one attempt to achieve it. And to me, deep learning is really just another tool from the machine learning toolbox (admittedly a very interesting and powerful one).

Deep learning still fits the supervised learning setting. You have a set of input/output example pairs and you’re interested in selecting the parameters of a model so as to minimize the prediction error. The main differences I see between deep learning and other “classical” machine learning methods like support vector machines are that the model is pretty powerful, that it is somewhat composable, and that preprocessing can be part of the model itself, whereas with classical ML methods you need to do it explicitly by hand.

Because they are conceptually the same, the way you work with these methods to create products is essentially the same: You compile example data, you define how to measure the performance of a model, you train your model on the data, and so on (although the actual technology is different, of course).
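To make that workflow concrete, here is a minimal sketch in plain Python. Everything in it is an assumption made for illustration: the data is synthetic, the model is a one-variable linear fit trained by gradient descent, and the names `train_linear` and `mean_squared_error` are mine. The point is only that the four steps stay the same if you swap the linear model for a neural network.

```python
import random


def mean_squared_error(model, examples):
    """Step 2: the performance measure (average squared prediction error)."""
    return sum((model(x) - y) ** 2 for x, y in examples) / len(examples)


def train_linear(examples, learning_rate=0.02, epochs=2000):
    """Step 3: pick parameters w, b of y ~ w * x + b that minimize the error."""
    w, b = 0.0, 0.0
    n = len(examples)
    for _ in range(epochs):
        # Gradient of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in examples) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in examples) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return lambda x: w * x + b


# Step 1: compile example data (synthetic here: y = 3x + 1 plus a bit of noise).
rng = random.Random(0)
data = [(i / 10, 3 * (i / 10) + 1 + rng.gauss(0, 0.1)) for i in range(50)]
rng.shuffle(data)
train_examples, test_examples = data[:40], data[40:]

# Step 3: train on one part of the data ...
model = train_linear(train_examples)

# Step 4: ... and evaluate on the held-out part, next to a trivial baseline.
test_error = mean_squared_error(model, test_examples)
baseline_error = mean_squared_error(lambda x: 0.0, test_examples)
print(f"model: {test_error:.4f}  baseline: {baseline_error:.4f}")
```

Nothing outside `train_linear` knows what kind of model it is dealing with, which is why the surrounding product work looks so similar for classical ML and deep learning.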

Levels of Being Data Driven

For me, the biggest difference in a product is how you use the available data. Some people already call a product “data driven” if you use customer behavior data to analyze the product itself. Having clearly defined KPIs like click-through rate, or analyzing sessions to see where the product does not behave as well as you’d expect it to: such activities are all already some form of being “data driven.”

The key difference when you start employing machine learning methods (including deep learning) is that you don’t just use the data to generate insights so that you can then think about how to improve a product, but you use the data to train models, and thereby, in a way, let the machine learn how to improve the product.

Let us take recommendations as an example. You could in principle hand-code how to compute recommendations. You could show the most popular items from the same category, brand, or color. You could then monitor whether customers click or buy items you recommended to achieve the first level of being data driven. But you could still work on the recommendation algorithm manually. You could identify categories which don’t perform well and then adjust the rules accordingly for these categories.
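As a rough sketch of this first level, the following Python snippet hand-codes one such rule (most popular items from the same category) and computes a click-through rate per category for monitoring. The catalog, the purchase log, the impression log, and the function names are all invented for illustration.

```python
from collections import Counter

# Hypothetical catalog (item -> category) and purchase history.
CATALOG = {
    "red_sneaker": "shoes",
    "blue_sneaker": "shoes",
    "black_boot": "shoes",
    "white_tshirt": "shirts",
    "grey_hoodie": "shirts",
}
PURCHASES = [
    "blue_sneaker", "blue_sneaker", "black_boot", "white_tshirt",
    "grey_hoodie", "grey_hoodie", "red_sneaker", "blue_sneaker", "black_boot",
]


def recommend_same_category(item, catalog, purchases, k=2):
    """Hand-coded rule: the k most purchased other items in the same category."""
    category = catalog[item]
    popularity = Counter(
        p for p in purchases if catalog[p] == category and p != item
    )
    return [name for name, _ in popularity.most_common(k)]


def ctr_by_category(impression_log):
    """Monitoring: click-through rate per category from (category, clicked) pairs."""
    shown = Counter(category for category, _ in impression_log)
    clicked = Counter(category for category, was_clicked in impression_log if was_clicked)
    return {category: clicked[category] / shown[category] for category in shown}


recommendations = recommend_same_category("red_sneaker", CATALOG, PURCHASES)

# Hypothetical log of shown recommendations and whether they were clicked.
impression_log = [("shoes", True), ("shoes", False), ("shirts", False), ("shirts", False)]
rates = ctr_by_category(impression_log)
print(recommendations, rates)
```

A category with a low rate in `rates` is where you would go back and adjust the rule by hand. The data informs the decision, but a person still writes the algorithm.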

The second level of being “data driven” is leaving the choice of what to show to an algorithm that is based on customer behavior data from the past. There may be large differences in sophistication, but essentially, everything from collaborative filtering through classical machine learning to recurrent neural networks is the same approach to showing recommendations.

Because these different technologies follow the same approach, you can actually compare collaborative filtering to deep learning in an objective manner based on the KPI you have defined. And if collaborative filtering is better, why not stick with it?
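Here is a small sketch of what such an objective comparison can look like. Two recommenders are fitted on the same sessions and scored with the same KPI on the same held-out sessions. The KPI (hit rate on the last item of a session), the toy sessions, and both recommenders are assumptions for illustration; the co-occurrence model is a very simple stand-in for collaborative filtering, and the numbers on invented data say nothing about how the methods compare in a real product.

```python
from collections import Counter, defaultdict


def fit_popularity(sessions):
    """Baseline: always recommend the globally most frequent unseen items."""
    counts = Counter(item for session in sessions for item in session)
    ranked = [item for item, _ in counts.most_common()]
    return lambda seen, k: [item for item in ranked if item not in seen][:k]


def fit_cooccurrence(sessions):
    """Simple item-to-item model: recommend items that co-occur with seen items."""
    co = defaultdict(Counter)
    for session in sessions:
        for a in session:
            for b in session:
                if a != b:
                    co[a][b] += 1

    def recommend(seen, k):
        scores = Counter()
        for a in seen:
            scores.update(co[a])  # add up co-occurrence counts
        return [item for item, _ in scores.most_common() if item not in seen][:k]

    return recommend


def hit_rate(recommend, test_sessions, k=1):
    """KPI: share of sessions whose last item is among the top-k recommendations."""
    hits = 0
    for session in test_sessions:
        seen, target = session[:-1], session[-1]
        if target in recommend(seen, k):
            hits += 1
    return hits / len(test_sessions)


# Hypothetical training and held-out sessions.
train_sessions = [
    ["tent", "sleeping_bag"], ["tent", "sleeping_bag"],
    ["laptop", "mouse"], ["laptop", "mouse"],
    ["socks"], ["socks"], ["socks"],
]
test_sessions = [["tent", "sleeping_bag"], ["laptop", "mouse"]]

popularity_score = hit_rate(fit_popularity(train_sessions), test_sessions)
cooccurrence_score = hit_rate(fit_cooccurrence(train_sessions), test_sessions)
print(popularity_score, cooccurrence_score)
```

Any other model, a recurrent neural network included, can be dropped in as long as it exposes the same `recommend(seen, k)` interface. `hit_rate` does not care how the recommendations were produced.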

What are the Benefits of AI over “classical” ML?

Still, as the question keeps coming up, I am wondering what the real underlying question is. There are some areas where deep learning clearly outperforms classical ML methods, but it does not seem to me that this is the reason behind the push for AI.

Leaving out the obvious possibility that some people insist on something “being AI” for the heck of it, one aspect could be the amount of manual intervention necessary. In this thinking, the “more AI” a system is, the less human interaction is necessary to tune the system well.

I wonder whether that is really true. There are cases that support it, like this work by colleagues at Zalando, who report that switching from an explicit feature engineering and logistic regression pipeline to recurrent neural networks made training much easier. I am all for reducing complexity, so that is certainly a win.

On the other hand, I think the idea that neural networks do not require any manual insights to construct and use is not true either. Usually, a lot of experience has to go into devising the right kind of architecture, not to speak of the countless hours spent on training these complex models.

If your concern is, more generally, to accelerate the creation of powerful AI-based products, I also think that training an ML or AI model is only a small part of the overall work that goes into creating a product powered by ML/AI. Research is necessary to create a product that the customer is actually interested in, and significant engineering is required to set up the architecture to deliver the product.

I have yet to validate my hunch that the push towards “AI” instead of more classical ML methods is driven by the hope that an “intelligent” system will reduce the dependence on highly skilled (and paid) data scientists. But I am doubtful whether that hope is justified. To be honest, I think the industry wisdom that the simplest method is often good enough still holds, so blind reliance on AI technology might actually increase the cost and time of bringing a product live.
