Factoids, Stories and Insights

Recently, The Economist had a special report titled “Data, data everywhere”. The report examines the rapid increase in data volumes and its implications. It got the attention of the blogosphere (example) and I recommend taking a look if you haven’t already.

When I read articles like these, I try to extract three categories of “knowledge” for future use: factoids, stories, and insights.

  • Factoids are simply data points that I feel might come in handy someday
  • Stories are real-world anecdotes. The most memorable ones have an “aha!” element to them.
  • Insights are observations (usually at a higher level of abstraction than stories) that make me go “I never thought of that before. But it makes total sense.”

Think of this crude categorization as my personal approach to dealing with information overload. Of course, there’s a fair amount of subjectivity here: what I think of as an insight may be obvious to you and vice-versa.

So what did I make of The Economist article? There were numerous factoids that I cut-and-stored away (too many to list here but email me if you want the list), a few memorable stories, and a couple of insights.

Let’s start with the stories.

In 2004 Wal-Mart peered into its mammoth databases and noticed that before a hurricane struck, there was a run on flashlights and batteries, as might be expected; but also on Pop-Tarts, a sugary American breakfast snack. On reflection it is clear that the snack would be a handy thing to eat in a blackout, but the retailer would not have thought to stock up on it before a storm.

Memorable and concrete. Neat.

Consider Cablecom, a Swiss telecoms operator. It has reduced customer defections from one-fifth of subscribers a year to under 5% by crunching its numbers. Its software spotted that although customer defections peaked in the 13th month, the decision to leave was made much earlier, around the ninth month (as indicated by things like the number of calls to customer support services). So Cablecom offered certain customers special deals seven months into their subscription and reaped the rewards.

Four months before the customer defected, early-warning signs were beginning to appear. Nice but not particularly unexpected.

Airline yield management improved because analytical techniques uncovered the best predictor that a passenger would actually catch a flight he had booked: that he had ordered a vegetarian meal.

Hey, I knew this all along! Over 20 years, I have ordered vegetarian meals almost every time and have almost never missed a flight.

Just kidding. This came out of left field; I had never seen it before. While the claim that airline yield management improved substantially due to this single discovery feels like a stretch, the story is certainly memorable.

Sometimes those data reveal more than was intended. For example, the city of Oakland, California, releases information on where and when arrests were made, which is put out on a private website, Oakland Crimespotting. At one point a few clicks revealed that police swept the whole of a busy street for prostitution every evening except on Wednesdays, a tactic they probably meant to keep to themselves.

Worry-free Wednesdays! Great story, difficult to forget.
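
For the curious, here is a minimal sketch of the kind of tally that could surface such a gap. It assumes nothing about how Oakland Crimespotting actually works: the function names are hypothetical and the sample dates are made up purely for illustration.

```python
from collections import Counter
from datetime import date, timedelta

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def arrests_by_weekday(arrest_dates):
    """Count how many arrests fall on each day of the week."""
    counts = Counter(d.weekday() for d in arrest_dates)  # Monday is 0
    return {name: counts.get(i, 0) for i, name in enumerate(WEEKDAYS)}


def quiet_weekdays(arrest_dates):
    """Return the weekdays on which no arrests were recorded."""
    by_day = arrests_by_weekday(arrest_dates)
    return [name for name, n in by_day.items() if n == 0]


# Made-up sample (NOT real Oakland data): one arrest per day for
# four weeks, skipping every Wednesday (weekday() == 2).
start = date(2010, 3, 1)
all_days = [start + timedelta(days=i) for i in range(28)]
sample = [d for d in all_days if d.weekday() != 2]

print(arrests_by_weekday(sample))
print(quiet_weekdays(sample))
```

On real data the quiet day would rarely be a clean zero, so a comparison against the average count per weekday would probably be a better test than an exact-zero check.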

Let’s now turn to the two insights that stood out for me.

a new kind of professional has emerged, the data scientist, who combines the skills of software programmer, statistician and storyteller/artist to extract the nuggets of gold hidden under mountains of data.

This wasn’t completely new to me (I have friends whose job title is “Data Scientist”) but seeing the sentence in black and white crystallized the insight for me and made me appreciate the power of the trend, particularly the point that the data scientist needs to be at the intersection of programming, stats and storytelling.

As more corporate functions, such as human resources or sales, are managed over a network, companies can see patterns across the whole of the business and share their information more easily.

What the author means by “managed over a network” is “managed in the cloud”. In my experience, data silos are all too common and this often leads to decisions being optimized one silo at a time, even though optimizing across silos can produce dramatic benefit.

I had not appreciated that, as data for more and more business functions gets housed in the cloud, data silos will naturally disappear and it will become increasingly easy to optimize across functions.

Well, that was what I gleaned from the article. If you “extract knowledge” in a different way than factoids/stories/insights, do share in the comments – I would love to know.


7 thoughts on “Factoids, Stories and Insights”

  1. I must first tell you that a majority of the data world does seem just like a home of decorative mirrors in view of locating actuality. There’s a lot of so called pure data floating all over out there but from my expertise it appears like it is every bit of the duplicate rubbish. I am actually scouting for opt-in records that I could utilise for inbox marketing in a few specialized business segments but what I’ve been given up to now appears like it was merely farmed off of online resources. Most of the blogs and forums nowadays are interesting but I feel dizzy just seeking to sort through all the content of good information.

  2. Fully agree with your point, Kasthuri.

    To get value from cross-silo optimization, many obstacles have to be overcome: technological, organizational, analytical. As each obstacle falls, the bottleneck will shift to one of the others.

    “Data hoarding” is a common organizational obstacle and stems from incentives that, inadvertently, encourage inter-silo competition. My experience is that if technology/analytics can’t be readily blamed (i.e. they are no longer the obstacle), then a lot of executive pressure can be brought to bear on the organization to share data and get on with it.

  3. Rama,

    Thank you for the insightful article!

    Your last point is well taken on the gains to be had from optimizing across the organization. I do wonder if basic human nature will also have to be taken into account. Moving to the ‘cloud’ may not necessarily get rid of the data silos because of the natural tendencies to ‘not share’. Can just technology and/or business processes overcome this? Or do we need an organizational culture that promotes sharing data (knowledge)?

  4. Rama,

    Thank you for the thought-provoking post. I had never explicitly considered my ‘knowledge extraction’ strategy and your post forced me to reflect upon it. And I love the model you proposed. I have not been able to simplify my approach as neatly but this is how it works for me:

    I scan the article and look to connect the facts and stories with the themes that I already know. Sometimes, an additional factoid, story or opinion crystallizes an existing theme and becomes an insight. Anything that does not fit nicely is the beginning of a new theme.

    As I said, a bit messy. Thank you again for the new theme.

    I loved the vegetarian meal story. However, it will be hard for Air India to leverage this insight.

  5. Rama, You have a gift for storytelling, a very Indian gift in a way but also one quite universal. Combine that with modern Analytic insight and it is a pleasure for the modern human drowning in the deluge of data, trying to make sense of her/his world. The theme of this current post is broad and has decades-spanning implications in the way we interact with the world as individuals and how in turn the entities we transact with on a daily basis view us. Bravo! on a very nicely written piece of insight that tells us more than what the original author intended. Badri.
