Find hidden insights in your data: Ask why and why again | @acotgreave

We’ve all encountered a curious little kid who wouldn’t stop asking “why.” But did you ever think that, in the business world, you should actually aim to be that kid? That’s because people — whether we’re talking about children, journalists, scientists, managers or data analysts — don’t succeed by simply asking “what.”

When Big Data Means Bad Analytics

When analytics delivers disappointing results, it is often because there is not enough analytic expertise and/or a lack of understanding of the business objectives for using Big Data in the first place. To avoid failure, insist on high standards.


Most polling stories go straight to the horse race – they focus on who is ahead and by how much. The problem is that, in a tight election, those numbers are misleading, and may even be wrong, if the gap between the candidates falls within the poll’s margin of error or confidence interval (C.I.). This becomes a big issue in a scary election, such as the one we are currently witnessing.

For example, here is the way the September 11th ABC News/Washington Post poll was reported by aggregator Real Clear Politics:

Clinton has 46 percent support among likely voters in the latest ABC News/Washington Post poll, with 41 percent for Trump, 9 percent for Libertarian Gary Johnson and 2 percent for Jill Stein of the Green Party.

Clinton’s 5-point advantage is within this poll’s margin of error, but it bears up in the context of consistent results among likely voters all summer.

The first paragraph leads the reader to believe Clinton has a 5-point edge. But the second paragraph acknowledges that this lead falls within the poll’s margin of error, then implies it is okay to look at the mean values anyway because earlier polls showed the same trend. Shame! The author should know better; there is no statistical foundation for the claim that past results, which probably had the same problem, justify overlooking the C.I. dilemma here. As such, no inference about who is ahead can be drawn from this poll.
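To make the margin-of-error check concrete, here is a minimal sketch in Python. It assumes simple random sampling and a 95% confidence level, and it uses the standard multinomial formula for the variance of the gap between two shares measured in the same sample. The function names and the sample size are hypothetical; the excerpt above does not report how many likely voters were polled.

```python
import math

Z_95 = 1.96  # z-score for a 95% confidence level (an assumption)


def lead_margin_of_error(p1, p2, n, z=Z_95):
    """Margin of error on the lead (p1 - p2) between two candidates in one poll.

    Uses the multinomial variance of a difference of two shares from the
    same sample: Var(p1 - p2) = (p1 + p2 - (p1 - p2)**2) / n.
    """
    variance = (p1 + p2 - (p1 - p2) ** 2) / n
    return z * math.sqrt(variance)


def lead_is_significant(p1, p2, n, z=Z_95):
    """True only if the lead is larger than its own margin of error."""
    return abs(p1 - p2) > lead_margin_of_error(p1, p2, n, z)


# Shares come from the poll quoted above; the sample size is HYPOTHETICAL.
clinton, trump = 0.46, 0.41
assumed_n = 600

moe = lead_margin_of_error(clinton, trump, assumed_n)
print(f"lead = {clinton - trump:.3f}, margin of error on lead = {moe:.3f}")
print("lead exceeds margin of error:", lead_is_significant(clinton, trump, assumed_n))
```

Note that the margin of error on the gap between two candidates is wider than the margin usually quoted for a single candidate's share, so a lead that looks comfortable can still sit inside it. Real polls also apply weighting, which this sketch ignores.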

#knowyourdata #Facebook Video Views Metric ‘error’ occurs in every organization

I see this problem in every organization, and it’s (usually) not intentional. Nor is it a miscalculation.

Facebook has at least 2 different ways of measuring ‘Average Duration of Videos Viewed’.

1. sum(total seconds viewed) / sum(total users viewed)

This is the calculation the C-suite thought they were reporting.

2. If sum(total seconds viewed) > 3, then ( sum(total seconds viewed) / sum(total videos viewed) ), else 0

This is the calculation they were reporting.
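The gap between the two definitions is easy to see with a small sketch. The Python below is one possible reading of the two calculations above, applied per view: the first averages over every view, the second only counts a view if it lasted more than 3 seconds. The function names and the sample view durations are hypothetical and are not Facebook's actual code or data.

```python
def average_duration_all_views(view_seconds):
    """Definition 1: total seconds viewed divided by the number of views."""
    if not view_seconds:
        return 0.0
    return sum(view_seconds) / len(view_seconds)


def average_duration_counted_views(view_seconds, min_seconds=3):
    """Definition 2 (assumed reading): ignore views of min_seconds or less."""
    counted = [s for s in view_seconds if s > min_seconds]
    if not counted:
        return 0.0
    return sum(counted) / len(counted)


# HYPOTHETICAL durations, in seconds, for five views of one video.
views = [1, 2, 2, 10, 30]

print(average_duration_all_views(views))      # 45 / 5 = 9.0
print(average_duration_counted_views(views))  # 40 / 2 = 20.0
```

Both functions are arithmetically correct. They simply answer different questions, which is why the definition behind a metric matters as much as the number itself.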

The end result?