Your data smells

You know what a product manager eats for breakfast, lunch and sometimes dinner?


It’s always about the data, and it’s not always palatable.

When I started working as a product manager, I had absolutely no experience with any kind of data analysis tool—hell, I didn’t even know how to do a VLOOKUP in Excel. I remember spending about a week drowning in videos explaining Google Analytics and its features, but I’ve learnt over the past year that using a tool to get data is the easy part. Understanding what to make of the data is the tricky part.

There have been several times when I’ve nearly torn my hair out because I didn’t know what to make of questionable data. I also found this post (thanks to the Mind the Product newsletter) that was unfortunately all too relatable. The main problem is that data comes from multiple sources.

Imagine multiple water pipes going into a single water tank. Each pipe has a different diameter, so each one fills the tank at a different rate, but all the pipes together achieve the final goal of filling up the tank. Data is what flows through the pipes, and each “pipe” is a function of the source of the data and its unit of measurement. One pipe would carry my traffic data (sessions) from Google Analytics, while another would carry the number of users who clicked through an email from Mandrill. Depending on the size of the product and the company involved, this could mean anything between 5 pipes and 50. Because the metrics come from different systems, it can be a nightmare to figure out why one went up and another went down. Converting all the metrics to comparable units means even more data is lost; I think of it as a “conversion tax”. In practice, this means you should always account for weaknesses in the measurement process.

You can have sampling errors, sample sizes that are just too small, or wrongly defined segments; there is scope for error everywhere. Apart from being careful while analyzing data, I have always found it helpful to do a “smell test”. Check if your numbers stink! The easiest way to do this is to compare them to historical data. Most numbers don’t change significantly on a daily or even a weekly basis, so if a datapoint went from 0.2 to 20, you know something is wrong. That is the easy case, though. Usually the differences are small enough to be realistic but large enough to raise doubts about the correctness of the data. In these cases, you just have to go with your gut.
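For the curious, here is roughly what that smell test could look like in code. This is only a sketch under my own assumptions: the function name `smell_test`, the `max_ratio` threshold of 3 and the sample numbers are all made up for illustration, and it assumes a metric that never goes negative. It compares the newest value to the median of recent history and flags it when it is several times larger or smaller than usual.

```python
from statistics import median


def smell_test(history, latest, max_ratio=3.0):
    """Return True if `latest` looks suspicious next to recent history.

    history   -- past values of the same metric (assumed non-negative)
    latest    -- the new datapoint to check
    max_ratio -- hypothetical threshold: how many times larger or smaller
                 than the typical value we tolerate before raising a flag
    """
    if not history:
        raise ValueError("need at least one historical value to compare against")

    typical = median(history)
    if typical == 0:
        # No meaningful ratio; any non-zero value deserves a second look.
        return latest != 0

    ratio = latest / typical
    # Flag jumps in either direction: far above or far below the usual level.
    return ratio > max_ratio or ratio < 1 / max_ratio


# Made-up numbers echoing the 0.2 -> 20 example above.
history = [0.21, 0.19, 0.20, 0.22, 0.18]
print(smell_test(history, 20))    # True: a 100x jump stinks
print(smell_test(history, 0.23))  # False: within normal wobble
```

A check like this only catches the obvious cases. Wherever you set the threshold, the in-between values still need a human to look at them.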

I’ll be the first to admit that my gut has been wrong on several occasions. As the article suggests, when you do decide to go with your gut, the first step is to eliminate bad data and the systems that are contributing to bad data. After that, make an educated guess from the data that you have left on the table. It’s not likely to be 100% correct, but it will be a much closer representation of what the data is actually saying than whatever gets lost in translation through misinterpretation. Finally, this is something I’d like to record here for posterity:

It is critical that managers appreciate that measurement — all measurement — is fraught. Good measurements enlighten, but bad ones mislead.

Thomas C. Redman
