Every day at work, we tackle analytic problems that are important to retailers. Solving these problems better has the potential for substantial impact.
Very often, our first instinct is to find or devise a better algorithm to throw at the problem. This is not a bad thing to do, since the rate of progress in machine learning is high and you never know what powerful new algorithm popped up yesterday (example: Least Angle Regression, an important new prediction algorithm published in 2004, was invented by a researcher as he was reading the 2001 edition of the “data science” bible, Elements of Statistical Learning). Trying the latest and greatest algorithm may well solve the problem better than previous approaches.
However, many problems are stubbornly resistant to this approach. What then?
One of my favorite alternatives is to get better data. Not more data, but data that’s different from what has been used to solve the problem so far. If you have used demographic data, add purchase data. If you have both, add browsing data. If you have numeric data, add text data (aside: in recent work, we have seen very promising results from complementing traditional retail sales and promotions data with text data for customer modeling and personalization).
Peter Norvig, Director of Research at Google and co-author of one of the most lucid textbooks I have had the pleasure of reading (Artificial Intelligence: A Modern Approach), makes a compelling case for data in The Unreasonable Effectiveness of Data. Anand Rajaraman, a machine learning guru and entrepreneur who writes the insightful datawocky blog, argues that more data beats better algorithms.
Better algorithms, better data. Anything else?
There is: re-thinking the problem. (Don’t stop reading. I am not about to inflict on you the “every problem is an opportunity” advice reliably served up by inspirational speakers.)
Along with trying new algorithms or adding different data, it is often helpful to step back and confirm that the problem as formulated truly matches what the business cares about.
Take the “customer targeting” problem that arises in direct marketing. Customer targeting is about deciding which customers should be mailed (since mailing every customer is expensive). This is an old problem that has been studied by numerous researchers and practitioners. The most commonly used approach is as follows:
- send a test mailing to a sample of customers
- use the results of the test mailing to build a “response model” that predicts each customer’s propensity to respond to the mailing as a function of their attributes, past history, etc.
- use this model to score each customer in the database and mail to the top scorers
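To make the three steps concrete, here is a minimal sketch in plain Python. Everything in it is an assumption made for illustration: the segment labels, the tiny made-up data set, and the helper names `fit_response_model` and `select_top_scorers`. The "model" is just the observed response rate per segment, standing in for whatever real model (logistic regression, trees, and so on) you would actually fit.

```python
from collections import defaultdict

# Hypothetical test-mailing results: (customer_segment, responded).
# A real model would use many customer attributes, not one segment label.
test_mailing = [
    ("frequent", 1), ("frequent", 1), ("frequent", 0), ("frequent", 1),
    ("occasional", 0), ("occasional", 1), ("occasional", 0), ("occasional", 0),
    ("lapsed", 0), ("lapsed", 0), ("lapsed", 0), ("lapsed", 1),
]


def fit_response_model(results):
    """Estimate P(respond | segment) as the observed response rate per segment."""
    mailed = defaultdict(int)
    responded = defaultdict(int)
    for segment, outcome in results:
        mailed[segment] += 1
        responded[segment] += outcome
    return {seg: responded[seg] / mailed[seg] for seg in mailed}


def select_top_scorers(customers, model, budget):
    """Score every customer and return the ids of the `budget` highest scorers."""
    # Customers in a segment the model has never seen get a score of 0.0.
    ranked = sorted(customers, key=lambda c: model.get(c[1], 0.0), reverse=True)
    return [customer_id for customer_id, _ in ranked[:budget]]


# Step 2: build the response model from the test mailing.
model = fit_response_model(test_mailing)

# Step 3: score a hypothetical customer database of (customer_id, segment)
# and mail only as many customers as the budget allows.
database = [(101, "lapsed"), (102, "frequent"), (103, "occasional"), (104, "frequent")]
to_mail = select_top_scorers(database, model, budget=2)

print(model)
print(to_mail)
```

With this made-up data, the two "frequent" customers end up on the mailing list because their segment responded most often in the test.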
This looks reasonable and may well be what the business cares about. But perhaps not.
The words “response model” suggest that the mailing caused the customer to respond. In reality, the customer may have come into the store and made a purchase anyway (I am thinking of multichannel retailers and not pure-play catalog retailers. For the latter, without the catalog, it may be impossible for customers to make a purchase so the word “response” may be appropriate).
What these response models really do is identify customers who are likely to shop rather than customers likely to shop as a result of the mailing. But maybe what management really wants is the latter. For those customers who are either going to shop anyway or not going to shop regardless of what is mailed to them, mailing is a waste of money and potentially costs customer goodwill too. What the business may really want is to identify those customers who will shop if mailed, but won’t if not mailed.
This re-framing of the customer targeting problem, and the approaches for solving it, are relatively recent. It goes by several names, including uplift modeling and net lift modeling, and the academic work on it (good recent example) is quite sparse compared to traditional response modeling. Yet, for many retailers, this is a more relevant and useful way to frame and solve the customer targeting problem than doing it the old way.
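Here is a rough sketch of the difference between the two framings, again in plain Python. It assumes something the traditional recipe does not require: a randomly chosen control group that was deliberately not mailed. The data, the segment labels, and the helper names (`rows`, `purchase_rates`, `estimate_uplift`) are all made up for illustration, and the simple difference in purchase rates is only one basic way to estimate uplift, not necessarily what the literature recommends.

```python
from collections import defaultdict


def rows(segment, mailed, buyers, total):
    """Build `total` hypothetical records, the first `buyers` of which purchased."""
    return [(segment, mailed, 1 if i < buyers else 0) for i in range(total)]


# Hypothetical experiment records: (segment, was_mailed, purchased).
experiment = (
    rows("frequent", True, 3, 4) + rows("frequent", False, 3, 4)
    + rows("occasional", True, 2, 4) + rows("occasional", False, 0, 4)
    + rows("lapsed", True, 1, 4) + rows("lapsed", False, 0, 4)
)


def purchase_rates(data, mailed_flag):
    """Observed purchase rate per segment among mailed (or not mailed) customers."""
    counts = defaultdict(int)
    buyers = defaultdict(int)
    for segment, mailed, purchased in data:
        if mailed == mailed_flag:
            counts[segment] += 1
            buyers[segment] += purchased
    return {seg: buyers[seg] / counts[seg] for seg in counts}


def estimate_uplift(data):
    """Uplift per segment: purchase rate if mailed minus purchase rate if not."""
    treated = purchase_rates(data, True)
    control = purchase_rates(data, False)
    return {seg: treated[seg] - control[seg] for seg in treated}


# The traditional "response model" view looks only at mailed customers.
response = purchase_rates(experiment, True)
# The uplift view compares mailed customers against the control group.
uplift = estimate_uplift(experiment)

best_by_response = max(response, key=response.get)
best_by_uplift = max(uplift, key=uplift.get)

print(best_by_response, response)
print(best_by_uplift, uplift)
```

In this toy data the "frequent" segment has the highest response rate but zero uplift, since those customers buy just as often without the mailing. The "occasional" segment, which a response model would rank lower, is where the mailing actually changes behavior.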
One nice thing about re-framing the problem is the likelihood of finding low-hanging fruit. Because a newly framed problem has, almost by definition, received little attention, simple algorithms may yield benefits quickly.
In summary, these three roads – better algorithms, better data and a better problem definition – all have merit and play a part in our analytic journey.