Lessons from a Deep Learning Master

Photo by Valentin B. Kremer on Unsplash

Yoshua Bengio is a Deep Learning legend and won the Turing Award in 2018, along with Geoff Hinton and Yann LeCun.

In this short post, I want to highlight for you some clever things that Yoshua and his collaborators did to win a Machine Learning competition from a field of 381 competing teams. Perhaps these ideas will be useful in your own work.

In a world where powerful Deep Learning frameworks (e.g., TensorFlow, PyTorch) are a free download away, their competition-winning approach demonstrates nicely that your edge may come from how well you model the specifics of your problem.

Read the rest of the post on Medium.

How to Use Causal Inference In Day-to-Day Analytical Work (Part 2 of 2)

In Part 1, we looked at how to use Causal Inference to draw the right conclusions — or at least not jump to the wrong conclusions — from observational data.

We saw that confounders are often the reason why we draw the wrong conclusions and learned about a simple technique called stratification that can help us control for confounders.
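
To make the idea concrete, here is a minimal sketch of stratification in plain Python. Everything in it is an assumption made for illustration: the records, the `loyal`/`casual` segments, and the helper name `mean_spend_by_coupon` are invented and do not come from the original posts. It compares coupon users to non-users twice, first ignoring the customer segment and then within each segment.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (customer_segment, used_coupon, spend). All values are invented.
records = [
    ("loyal", True, 120), ("loyal", True, 120), ("loyal", True, 120),
    ("loyal", False, 120),
    ("casual", True, 40),
    ("casual", False, 40), ("casual", False, 40), ("casual", False, 40),
]

def mean_spend_by_coupon(rows):
    """Return {used_coupon: average spend} for the given rows."""
    groups = defaultdict(list)
    for _segment, used_coupon, spend in rows:
        groups[used_coupon].append(spend)
    return {used: mean(spends) for used, spends in groups.items()}

# Naive comparison: ignore the segment entirely.
naive = mean_spend_by_coupon(records)
naive_gap = naive[True] - naive[False]

# Stratified comparison: split by the suspected confounder (segment),
# then compare coupon users to non-users inside each stratum.
strata = defaultdict(list)
for row in records:
    strata[row[0]].append(row)

stratified_gaps = {}
for segment, rows in strata.items():
    by_coupon = mean_spend_by_coupon(rows)
    stratified_gaps[segment] = by_coupon[True] - by_coupon[False]

print("naive gap:", naive_gap)
print("gap within each segment:", stratified_gaps)
```

In this toy data, coupon users appear to spend 40 more on average, yet the gap inside each segment is 0. The apparent coupon effect comes entirely from loyal customers being more likely to use the coupon.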

In this article, we present another example of how to use stratification and then consider what to do when there are so many confounders that stratification becomes messy.

Read the rest of the post on Medium.

Data Scientists, Ask Yourself Often: So What?

I used to work at a global management consulting firm many years ago. As a new associate, when I presented the results of my work, I’d often be stopped in my tracks with, “That is interesting. But what is the so what here?”

“So what” was shorthand for several related things.

  • Is there anything actionable here?
  • What should we tell the client to do differently because of this?
  • If we continue down this path, does it get us closer to our ultimate destination?

New associates quickly developed the habit of considering the so-what of their findings before presenting anything. While painful and humbling at first, this turned out to be a very useful habit. It helped us avoid “boiling the ocean”, perform better under time pressure, and be more productive.

I think data scientists would benefit from cultivating this mindset.

Data science work involves activities that are rife with opportunities to get distracted: getting the data, exploring the data, understanding relationships between variables, formulating a problem, creating a common-sense baseline, building models, tuning hyper-parameters, and so on.

Good data scientists tend to be intellectually curious which, of course, is a fantastic thing. But it also means that they are likely to catch the glimmer of a shiny object off to the side of the road and follow it into a rabbit hole. While this is almost always intellectually fun, sometimes it will be useful, sometimes not.

To make sure you are spending your time wisely, you should periodically pause and ask yourself, “What’s the so what here?”.

Is there something concrete and actionable I can get out of this? Does it get me closer to solving the ultimate problem I am working on?

Your answer to the ‘so what’ question doesn’t have to be detailed or exact. It just has to pass a gut check that there’s at least a conceptual path from your current obsession to something useful. If you can’t find a path, you should re-assess if you should switch your focus to something else.

This habit is particularly useful when you start to work on a new problem, especially one posed by someone else and presented to you. As you try to understand and crystallize what exactly needs to be solved, you may come to realize that the problem as defined isn’t actually worth solving because something else is the bottleneck and that needs to be solved first.

Having a so-what mindset gets you to clarity faster and, as a bonus, also builds your reputation in the organization as a pragmatic, clear-thinking data scientist.

All this said, an important caveat.

Ask ‘so what’ in moderation. I am not recommending you become a so-what asking humorless robot.

Going where your curiosity takes you can be useful: you may serendipitously stumble on something valuable in your random explorations. More importantly, it is clearly necessary for one’s happiness. If I couldn’t randomly check stuff out and ‘aimlessly’ play with ideas, I would go crazy.

So explore, follow your curiosity, have fun. But have a background process running in your brain that periodically pops up and asks “what’s the so what here?”.

A Peek into the Incomparable Mind of Isaac Asimov

Isaac Asimov is one of my favorite writers. I recently finished reading It’s Been a Good Life, a compendium of excerpts from his letters, speeches and unpublished writing, curated by his wife Janet Jeppson Asimov.

The book is worth reading in its entirety — it is full of insights, candid self-reflections, pithy statements of his life philosophy, and accounts of pivotal life events. I picked a few below that particularly resonated with me and if they click with you as well, please do read the book.

Read the rest of the post on Medium …

How to Use Causal Inference In Day-to-Day Analytical Work (Part 1 of 2)

Analysts and data scientists operating in the business world are awash in observational data. This is data that’s generated in the course of the operations of the business. This is in contrast to experimental data, where subjects are randomly assigned to different treatment groups, and outcomes are recorded and analyzed (think randomized clinical trials or A/B tests).

Experimental data can be expensive or, in some cases, impossible/unethical to collect (e.g., assigning people to smoking vs non-smoking groups). Observational data, on the other hand, are very cheap since they are generated as a side effect of business operations.

Given this cheap abundance of observational data, it is no surprise that ‘interrogating’ this data is a staple of everyday analytical work. And one of the most common interrogation techniques is comparing groups of ‘subjects’ — customers, employees, products, … — on important metrics.

Shoppers who used a “free shipping for orders over $50” coupon spent 14% more than shoppers who didn’t use the coupon.

Products in the front of the store were bought 12% more often than products in the back of the store.

Customers who buy from multiple channels spend 30% more annually than customers who buy from a single channel.

Sales reps in the Western region delivered 9% higher bookings-per-rep than reps in the Eastern region.

Comparisons are very useful and give us insight into how the system (i.e. the business, the organization, the customer base) really works.

And these insights, in turn, suggest things we can do — interventions — to improve outcomes we care about.

Customers who buy from multiple channels spend 30% more annually than customers who buy from a single channel.

30% is a lot! If we could entice single-channel shoppers to buy from a different channel the next time around (perhaps by sending them a coupon that only works for that new channel), maybe they will spend 30% more the following year?

Products in the front of the store were bought 12% more often than products in the back of the store.

Wow! So if we move weakly-selling products from the back of the store to the front, maybe their sales will increase by 12%?

These interventions may have the desired effect if the data on which the original comparison was calculated is experimental (e.g., if a random subset of products had been assigned to the front of the store and we compared their performance to the ones in the back).

But if our data is observational (some products were selected by the retailer to be in the front of the store for business reasons; given a set of channels, some customers self-selected to use a single channel while others used multiple channels), we have to be careful.

Because comparisons calculated from observational data may not be real. They may NOT be a reflection of how your business really works, and acting on them may get you into trouble.
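
A small simulation can show how this happens. The sketch below is purely illustrative and rests on assumptions of my own: 1,000 hypothetical products whose sales depend only on their baseline popularity, so shelf position has no effect at all. It then measures the front-versus-back gap twice, once when the retailer chooses what goes in front and once when placement is random.

```python
import random
from statistics import mean

# Hypothetical setup: 1,000 products whose weekly sales depend only on their
# baseline popularity. Shelf position has NO effect in this toy world.
baseline_sales = list(range(1000))

def front_minus_back(in_front):
    """Average sales of front-of-store products minus back-of-store products."""
    front = [s for s, f in zip(baseline_sales, in_front) if f]
    back = [s for s, f in zip(baseline_sales, in_front) if not f]
    return mean(front) - mean(back)

# Observational: the retailer puts the 500 most popular products in front.
retailer_choice = [s >= 500 for s in baseline_sales]
observational_gap = front_minus_back(retailer_choice)

# Experimental: 500 products are picked for the front at random.
rng = random.Random(0)  # fixed seed so the run is repeatable
random_choice = [True] * 500 + [False] * 500
rng.shuffle(random_choice)
experimental_gap = front_minus_back(random_choice)

print(f"observational gap: {observational_gap:.1f}")
print(f"experimental gap:  {experimental_gap:.1f}")
```

The observational gap is large even though position does nothing here, because the retailer's choice is tied to popularity. Under random placement the gap shrinks toward zero, which is the true effect in this made-up world.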

How can we tell if a comparison is trustworthy? Read the rest of the post on Medium to learn how.