The hubris of bad metrics


Before we get started, a correction. Last week, due to too many re-writes, I ended up posting backward logic, as several correspondents pointed out. You’ll find a corrected version here, near the bottom, in the Bob’s Last Word segment.

# # #

Speaking of bad metrics (we weren’t, but I couldn’t come up with a better segue to this week’s topic), let me offer a big thank-you to Lee Neville, a long-time member of the KJR community, for bringing a high-profile example to our attention.

Titled “At N.F.L. Draft, America Begins Annual Tradition of Celebrating Hubris,” it’s by David Leonhardt and appeared in The New York Times, April 28, 2022. Annoyingly enough, I’m pretty sure Leonhardt’s core conclusion is at least partially on target: with lots at stake, and huge investments in data-gathering and analysis, the correlation between NFL draft rankings and player performance isn’t very good.

Leonhardt ascribes the problem to hubris, and extrapolates his conclusion to business hiring, which he suspects is just as unreliable, and for similar reasons.

So what’s the problem? Leonhardt bases his conclusion on the five 2018 first-round quarterback draft picks. Using career touchdowns as his metric, he demonstrates conclusively that actual career performance and draft order have little to do with each other.

I’m just messin’ with you. He did no such thing.

The illogic of his commentary began with his choice of career touchdowns as his quarterback performance metric. That choice leaves out a wealth of information that’s critical to evaluating Leonhardt’s contention. For example:

Did all five quarterbacks play the same number of games? The answer is no. Baker Mayfield, for example, didn’t play until partway through week three of the 2018 season. In all he’s played 60 games; Sam Darnold has played 50. Josh Allen, the quarterback with the most touchdowns, has, suggestively, played the most games, at 61. That leaves Josh Rosen, who has played 24 games, and Lamar Jackson, who has played 58.

It doesn’t take a metrics nerd to know that playing in fewer games means having fewer opportunities to score touchdowns.
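To make the point concrete, here’s a minimal sketch in Python. The games-played figures are the ones cited above; the touchdown rate is a made-up placeholder (not real data), deliberately set to the same value for all five quarterbacks. Even then, career totals differ simply because the games played differ, while the per-game rate shows no difference at all.

```python
# Games played, as cited in the column above.
GAMES_PLAYED = {
    "Baker Mayfield": 60,
    "Sam Darnold": 50,
    "Josh Allen": 61,
    "Josh Rosen": 24,
    "Lamar Jackson": 58,
}

# HYPOTHETICAL: an identical touchdowns-per-game rate for every quarterback.
# This is a placeholder for illustration, not an actual statistic.
HYPOTHETICAL_TDS_PER_GAME = 1.5


def career_total(rate_per_game, games):
    """Career total implied by a constant per-game rate."""
    return rate_per_game * games


def per_game(total, games):
    """Normalize a career total by the number of games played."""
    return total / games


# Raw totals: these differ only because games played differ.
totals = {
    name: career_total(HYPOTHETICAL_TDS_PER_GAME, games)
    for name, games in GAMES_PLAYED.items()
}

# Normalized rates: these are identical by construction.
rates = {name: per_game(totals[name], GAMES_PLAYED[name]) for name in totals}

if __name__ == "__main__":
    for name in sorted(totals, key=totals.get, reverse=True):
        print(f"{name:15s} total={totals[name]:5.1f} per game={rates[name]:.2f}")
```

Ranking by the raw total puts the quarterback with 61 games first and the one with 24 games last, even though, by assumption, all five performed identically per game.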

Did all five of these quarterbacks enjoy the same level of protection? Every football fan knows that the better the offensive line, the more time the quarterback has to execute plays and the more successful he’ll be.

And yet, with all the zillions of statistics football game callers give us to fill time as part of their commentary, nobody seems to measure the average time between the snap and the moment the quarterback throws the ball or is sacked (limiting analysis to passing plays only for clarity). A similar point could be made about the caliber of the team’s receivers and running backs.

ESPN take note.

How about the quality of coaching? While some quarterbacks do call some plays, the coaching staff creates the game plan and calls a lot of the plays as well. Presumably, some coaches are better at this than others, which means some quarterbacks are the beneficiaries of better game plans and play calling than others.

A sample size of five? Seriously? With all the data available for football, getting to the magic number of 30 data points – the usual rule-of-thumb minimum for general-purpose statistics – wouldn’t have been all that difficult. Statistically speaking, a sample size of five is just pretending, especially because … why choose the 2018 draft for analysis and not some other year, anyway?
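For a rough sense of how little five data points can tell you, here’s a sketch that assumes (my assumption, not Leonhardt’s method) you compare draft order to performance rank using Spearman’s rank correlation. It enumerates all 120 possible orderings of five players and counts how many would agree with the draft order that closely, or oppose it that strongly, by pure chance. The function names are illustrative.

```python
from fractions import Fraction
from itertools import permutations


def spearman_rho(ranks_a, ranks_b):
    """Spearman rank correlation for two rankings with no ties."""
    n = len(ranks_a)
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1 - Fraction(6 * d_squared, n * (n * n - 1))


def chance_of_at_least(threshold, n=5):
    """Share of all random orderings of n players whose correlation with
    the draft order is at least `threshold` in absolute value."""
    draft_order = tuple(range(1, n + 1))
    orderings = list(permutations(draft_order))
    hits = sum(
        1 for p in orderings if abs(spearman_rho(draft_order, p)) >= threshold
    )
    return Fraction(hits, len(orderings))


if __name__ == "__main__":
    print(chance_of_at_least(Fraction(9, 10)))  # correlation of 0.9 or stronger
    print(chance_of_at_least(1))                # perfect match or perfect reversal
```

Under these assumptions, one random ordering in twelve produces a correlation of 0.9 or stronger in one direction or the other. With five players, only a perfect match or a perfect reversal clears the conventional 5% bar.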

Bob’s last word: Picking on Leonhardt is fun, but it isn’t entirely fair. Far too many of his fellow reporters and opinion writers of all stripes just aren’t very good with math or statistics either, whether they cover sports, politics, management, or information technology. We can hope the level of sophistication among journalists who cover the fields of math and statistics is better.

Then there’s Leonhardt’s conclusion – that recruiting in all fields is a matter of hubris. It would be more convincing if he offered a better alternative. So yes, recruiting and selecting the best candidates to hire is an imperfect science at best. That doesn’t mean a high failure rate is due to character flaws all around.

It means it’s hard.

Bob’s sales pitch: Schrödinger’s cat is alive and well, as will be revealed on May 11th, 2:40pm CST. That’s when a battle royale will ensue, as I engage with the estimable Roger Grimes in The Great Quantum Debate: Is There a Role in Business Yet? as part of CIO’s Future of Data Summit.

Oh, okay, it won’t be a battle royale, but there’s a pretty good chance you’ll enjoy it almost as much as Roger, Eric Knorr – our moderator – and I did when we recorded it.

Comments (5)

  • In reading this piece, set in a context I know little about, my first thought was “Moneyball for football”, my second was “Tom Brady”, and my third was “men”! The hubris, of course, was not limited to the men associated with the draft; it was also Leonhardt’s.

  • You might also have mentioned that Leonhardt’s data would in no way account for the “another team” success of quarterbacks Matthew Stafford and Ryan Tannehill. That’s why it’s called a team sport.

    • I might have mentioned it, but it didn’t occur to me. That’s why KJR is also a team sport!

      Thanks for making the point. It’s a good one.

  • This can even be carried a little farther. I have long maintained that there is not one football statistic at the individual level that makes any sense whatsoever. A quarterback can’t complete a pass unless he has a competent receiver; a receiver can’t catch a pass unless he has a competent quarterback. A running back depends on well-executed laterals and handoffs. A place kicker needs a good snap, a good holder, and good blockers. And on and on. The only play in a football game that can be measured in isolation is a kickoff (only the kick, not the return), and that’s probably the one thing that doesn’t get measured!

    There’s only one statistic that matters in football (or any sport, for that matter): how many times did your team beat its opponents vs. how many times did it lose?

    Sports are the epitome of bad metrics, but they’re certainly not the only category!

    • Good point, and it extends nicely to the business world as well: With few exceptions, business metrics are process metrics, and healthy processes are generally a team proposition. What makes this particularly challenging is that while performance is a team property, there still are some employees who are more effective than others.

      But turning that into measurements is a messy proposition at best.
