Three Supernovae Every Night!

From A Universe from Nothing by Lawrence Krauss.

Go out some night into the woods or desert where you can see stars and hold up your hand to the sky, making a tiny circle between your thumb and forefinger about the size of a dime. Hold it up to a dark patch of the sky where there are no visible stars.

In that dark patch, with a large enough telescope like the type we have in service today, you could discern perhaps 100,000 galaxies, each containing billions of stars. Since supernovae explode once per hundred years per galaxy, with 100,000 galaxies in view, you should expect to see, on average, about three stars explode on a given night.

Wow. How’s that for a quick, Fermi-style back-of-the-envelope calculation? 🙂
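
For anyone who wants to check the arithmetic, here is a minimal sketch in Python. The galaxy count and the once-per-century rate come from the passage above; the variable names and the 365.25 nights per year are my own assumptions for illustration.

```python
# Back-of-the-envelope check of the supernova estimate quoted above.
GALAXIES_IN_VIEW = 100_000               # from the passage
SUPERNOVAE_PER_GALAXY_PER_YEAR = 1 / 100 # "once per hundred years per galaxy"
NIGHTS_PER_YEAR = 365.25                 # assumption: one night per calendar day


def supernovae_per_night(galaxies, rate_per_galaxy_per_year,
                         nights_per_year=NIGHTS_PER_YEAR):
    """Expected number of supernovae per night across all galaxies in view."""
    per_year = galaxies * rate_per_galaxy_per_year  # about 1,000 per year
    return per_year / nights_per_year


estimate = supernovae_per_night(GALAXIES_IN_VIEW, SUPERNOVAE_PER_GALAXY_PER_YEAR)
print(f"{estimate:.1f} supernovae per night")  # prints "2.7 supernovae per night"
```

Roughly 1,000 explosions a year spread over 365 nights gives a little under three a night, which matches the "about three" in the quote.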

Three Ways to Analytic Impact

(cross-posted from the CQuotient blog)

At work every day, we work on analytic problems that are important to retailers. Solving these problems in a better way has the potential for substantial impact.

Very often, our first instinct is to find/devise a better algorithm to throw at the problem. This is not a bad thing to do since the rate of progress in machine learning is high and you never know what powerful new algorithm popped up yesterday (example: Least Angle Regression, an important new prediction algorithm published in 2004, was invented by a researcher as he was reading the 2001 edition of the “data science” bible, Elements of Statistical Learning). Trying the latest and greatest algorithm may well solve the problem better than previous approaches.

However, many problems are stubbornly resistant to this approach. What then?

One of my favorite alternatives is to get better data. Not more data, but data that’s different from what has been used to solve the problem so far. If you have used demographic data, add purchase data. If you have both, add browsing data. If you have numeric data, add text data (aside: in recent work, we have seen very promising results from complementing traditional retail sales and promotions data with text data for customer modeling and personalization).

Peter Norvig, Director of Research at Google and co-author of one of the most lucid textbooks I have had the pleasure of reading (Artificial Intelligence), makes a compelling case for data in The Unreasonable Effectiveness of Data. Anand Rajaraman, a machine learning guru and entrepreneur who writes the insightful datawocky blog, argues that more data beats better algorithms.

Better algorithms, better data. Anything else?

There is. It is re-thinking the problem. (Don’t stop reading, I am not about to inflict on you the “every problem is an opportunity” advice reliably served up by inspirational speakers.)

Along with trying new algorithms or adding different data, it is often helpful to step back and confirm that the problem as formulated truly matches what the business cares about.

Take the “customer targeting” problem that arises in direct marketing. Customer targeting is about deciding which customers should be mailed (since mailing every customer is expensive). This is an old problem that has been studied by numerous researchers and practitioners. The most commonly used approach is as follows:

  1. send a test mailing to a sample of customers
  2. use the results of the test mailing to build a “response model” that predicts each customer’s propensity to respond to the mailing as a function of their attributes, past history etc.
  3. use this model to score each customer in the database and mail to the top scorers.
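
The three steps above can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the test-mailing data, the single "segment" attribute, and the function names are all made up, and the "model" is nothing more than the observed response rate per segment.

```python
from collections import defaultdict

# Step 1 (hypothetical data): results of a test mailing as
# (customer segment, responded?) pairs.
test_mailing = [
    ("frequent", True), ("frequent", True), ("frequent", False),
    ("occasional", True), ("occasional", False), ("occasional", False),
    ("lapsed", True), ("lapsed", False), ("lapsed", False), ("lapsed", False),
]


def fit_response_model(results):
    """Step 2: estimate the propensity to respond for each segment."""
    mailed = defaultdict(int)
    responded = defaultdict(int)
    for segment, did_respond in results:
        mailed[segment] += 1
        responded[segment] += int(did_respond)
    return {segment: responded[segment] / mailed[segment] for segment in mailed}


def select_top_scorers(customers, model, budget):
    """Step 3: score every customer and keep the `budget` highest scorers."""
    ranked = sorted(customers, key=lambda c: model.get(c[1], 0.0), reverse=True)
    return [customer_id for customer_id, _segment in ranked[:budget]]


model = fit_response_model(test_mailing)
database = [("c1", "lapsed"), ("c2", "frequent"),
            ("c3", "occasional"), ("c4", "frequent")]
mail_list = select_top_scorers(database, model, budget=2)
print(mail_list)  # prints ['c2', 'c4']
```

A real response model would use many more attributes and a proper learning algorithm, but the test, fit, score-and-mail shape is the same.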

This looks reasonable and may well be what the business cares about. But perhaps not.

The words “response model” suggest that the mailing caused the customer to respond. In reality, the customer may have come into the store and made a purchase anyway (I am thinking of multichannel retailers and not pure-play catalog retailers. For the latter, without the catalog, it may be impossible for customers to make a purchase so the word “response” may be appropriate).

What these response models really do is identify customers who are likely to shop rather than customers likely to shop as a result of the mailing. But maybe what management really wants is the latter. For those customers who are either going to shop anyway or not going to shop regardless of what is mailed to them, mailing is a waste of money and potentially costs customer goodwill too. What the business may really want is to identify those customers who will shop if mailed, but won’t if not mailed.

This re-framing of the customer targeting problem and approaches for solving it are relatively recent. It goes by many names – uplift modeling, net lift modeling – and the academic work on it (good recent example) is quite minimal compared to traditional response modeling. Yet, for many retailers, this is a more relevant and useful way to frame and solve the customer targeting problem than doing it the old way.
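
To make the difference concrete, here is a minimal sketch of one simple way to estimate uplift: compare shop rates between customers who were mailed and a held-out group who were not, segment by segment. The data, the segment names and the helper functions are hypothetical, and this is not meant to represent how any particular uplift method in the literature works.

```python
from collections import defaultdict


def make_rows(segment, mailed, shoppers, total):
    """Hypothetical helper: `shoppers` of `total` customers in the group shopped."""
    return [(segment, mailed, i < shoppers) for i in range(total)]


# Hypothetical experiment: rows are (segment, was_mailed, shopped).
experiment = (
    make_rows("frequent", True, 3, 4) + make_rows("frequent", False, 3, 4)
    + make_rows("occasional", True, 2, 4) + make_rows("occasional", False, 1, 4)
    + make_rows("lapsed", True, 0, 4) + make_rows("lapsed", False, 0, 4)
)


def shop_rates(rows, mailed_flag):
    """Shop rate per segment among customers with the given mailed status."""
    counts = defaultdict(int)
    shopped = defaultdict(int)
    for segment, was_mailed, did_shop in rows:
        if was_mailed == mailed_flag:
            counts[segment] += 1
            shopped[segment] += int(did_shop)
    return {segment: shopped[segment] / counts[segment] for segment in counts}


def uplift_by_segment(rows):
    """Uplift = shop rate when mailed minus shop rate when not mailed."""
    mailed = shop_rates(rows, True)
    not_mailed = shop_rates(rows, False)
    return {seg: mailed[seg] - not_mailed[seg] for seg in mailed if seg in not_mailed}


response = shop_rates(experiment, True)   # what a response model would rank on
uplift = uplift_by_segment(experiment)    # what the business may really want
best_by_response = max(response, key=response.get)
best_by_uplift = max(uplift, key=uplift.get)
print(best_by_response, best_by_uplift)  # prints "frequent occasional"
```

In this made-up data the "frequent" segment has the highest response rate but zero uplift, because those customers shop just as often without the mailing. The "occasional" segment is where the mailing actually changes behavior.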

One nice thing about re-framing the problem is the likelihood of finding low-hanging fruit. Since the new problem hasn’t received enough attention (by definition), simple algorithms may yield benefits quickly.

In summary, these three roads – better algorithms, better data and a better problem definition – all have merit and play a part in our analytic journey.




Personalized Ads Don’t Always Work?

(cross-posted from the CQuotient blog)

According to a recent MIT study reported in MediaPost, personalized advertising doesn’t always work.

Contrary to popular practice, personalized ads not only don’t drive conversions, but are likely to be ignored, according to the study by MIT Sloan School of Management Prof. Catherine Tucker and London Business School Prof. Anja Lambrecht.

This was very provocative indeed! The key finding of the research was:

When online shoppers were simply looking at a product category, ads that matched their prior Web browsing interests were ineffective. However, after consumers had visited a review site to seek out information about product details — and were closer to a purchase — then personalized ads became more effective than generic ads intended for a mass audience.

I found this to be simultaneously obvious and confusing.

Obvious in that what you show a shopper should (of course!) be tailored to where they are in the purchase process. Confusing because the study effectively assumes that personalization is only about creating product-specific content.

In my opinion, this is a very narrow definition of personalization. The way we think about this at CQuotient, what you say to the customer at a point in time should be tailored to their current state and past history.

For example, if the customer’s current state suggests that they are near the beginning of the purchase process for back-to-school shopping for their 9 year old boy, showing them an ad or presenting them an offer for SKU #823272 (“boys short-sleeve baseball tee”) is suboptimal.

However, an ad along the lines of “Get your child ready for back-to-school. Great selection of boys uniforms and sports-themed casual clothing in the stores right now”, with images of smart-looking 9-10 year old boys wearing the mentioned merchandise, and a 25%-off-all-purchases-this-weekend coupon, may be just right for that customer.

It may sway her to shop with us rather than the competition, spend the majority if not all of her back-to-school clothing budget with us, and do so this very weekend. A win-win outcome.

Knowing the customer’s current state helped us determine the right level of personalized messaging, and knowing her past purchase history (loves sports-themed casuals and preppy looks, and is responsive to coupons) told us what to emphasize in the message and how to design the right promotional offer.

From this perspective, the finding that customers just beginning the purchase process don’t respond to product-specific ads is neither surprising nor a blow to personalization. But it does underscore the importance of thinking about personalization in a holistic way.


Impact of “Big Data” on Retail: The McKinsey View (Part 2 of 2)

A few weeks ago, I blogged about a recent McKinsey & Company report on the emergence and impact of “Big Data”. I highlighted the retail areas where significant gains may be achievable by harnessing analytics and big data. In this post on the CQuotient blog, I complete my summary of the report. Please head over there if you are interested. Thanks!

Product Personalization: Good or Bad?

(cross-posted from the CQuotient blog)

Personalizing products and offers to suit customers’ unique tastes is a core element of CQuotient’s product focus. So I perked up when I started seeing negative articles on personalization over the past few weeks, triggered by a book called The Filter Bubble: What the Internet Is Hiding From You by Eli Pariser.

In an interview with the New York Times, Mr. Pariser says:

Personalization on the Web is becoming so pervasive that we may not even know what we’re missing: the views and voices that challenge our own thinking.

People love the idea of having their feelings affirmed. If you can provide that warm, comfortable sense without tipping your hand that your algorithm is pandering to people, then all the better.

Personalization channels people into feedback loops, or “filter bubbles,” of their own predilections.

The gist of his argument is that personalization technologies censor what we see. Comfort wins, diversity suffers, and you are worse off as a result.

While some of Mr. Pariser’s comments make sense to me, I do think he is painting with too broad a brush. If the news we receive is heavily censored by personalization technologies (or anything else for that matter), it is a dangerous thing and worth being vigilant about. But I don’t see how this applies to personalizing product recommendations.

Personalized product recommendations help you discover things you end up liking that you would have never thought of looking for. They are a great answer to the problem of finding good needles in an almost infinite haystack. Now, as my colleague Graeme Grant points out, the reality is that most forms of personalization out there start and end with addressing the customer by their first name. But personalization stalwarts like Amazon and Netflix (that’s how I stumbled on 24 :-)) have shown what’s possible.

Greg Linden, who was part of the team that created Amazon’s personalization engine, says personalization is all about serendipity:

Eli has a fundamental misunderstanding of what personalization is, leading him to the wrong conclusion. The goal of personalization and recommendations is discovery. Recommendations help people find things they would have difficulty finding on their own.

If you know about something already, you use search to find it. If you don’t know something exists, you can’t search for it. And that is where recommendations and personalization come in. Recommendations and personalization enhance serendipity by surfacing useful things you might not know about.

That is the goal of Amazon’s product recommendations, to help you discover things you did not know about in Amazon’s store. It is like a knowledgeable clerk who walks you through the store, highlighting things you didn’t know about, helping you find new things you might enjoy. Recommendations enhance discovery and provide serendipity.

Greg goes on to write that news personalization also promotes serendipity and pulls people out of their comfort zone. I am not sure I agree with him on this point – I am more in agreement with Eli on the potential negative effects of personalizing news.

But when it comes to product personalization, we don’t get a filter bubble. We get a serendipity amplifier. And we can all use more serendipity in our lives.