If you’re interested in machine learning, and especially if you have any involvement in big data, analytics, and related matters, before today is over you must read “Why scientific findings by AI can’t always be trusted” (Maria Temming, Science News, Vol. 195, No. 7, 4/13/2019).

It describes research by Genevera Allen, a data scientist at Rice University, that attempts to answer a question asked in this space not long ago: With neural networks, which can’t explain their logic when presenting a conclusion, aren’t we just substituting trusting a machine’s guts for our own?

Allen’s conclusion: Yes, we are, and no, we shouldn’t.

Machine learning can, she says, be useful for providing preliminary results humans can later validate. “More exploratory algorithms that poke around datasets to find previously unknown patterns or relationships are very hard to verify,” she explains. “Deferring judgment to such autonomous systems may lead to faulty conclusions.”

Reinforcing the parallel with humans and their guts, Allen points out one of the more important limitations of machine learning: “… data-mining algorithms are designed to draw conclusions with no uncertainty.”

The people I know who trust their guts also seem to lack uncertainty.

Among those who should be less certain are those who figure the so-called “technological singularity” represents the biggest risk AI poses to humanity at large. The singularity — runaway AI where automated improvement cycles beget ever-more-advanced non-biological superintelligences — is the least of our concerns, for the simple reason that intelligence and motivation have little to do with each other.

To choose a banal example, Watson beat all human opponents at Jeopardy. We didn’t see a bunch of autonomous Watsons vying to become the next game-show contestants. Watson provided the ability; IBM’s researchers provided the motivation.

If we shouldn’t worry about the Singularity, what should concern us?

The answer: GPT-2 and, more broadly, the emerging technology of AI text generation.

As is so often the case, the danger doesn’t come from the technology itself. It comes from us pesky human beings who will, inevitably, use it for nefarious purposes.

This isn’t science fiction. The risk is now. Assuming you haven’t been living in a cave the past couple of years, you know that Russian operatives deployed thousands of ‘bots across social media to influence the 2016 election by creating a Twitter echo chamber for opinions they wanted spread to audiences they considered vulnerable.

Now … add sophisticated text generation to these ‘bots’ capabilities.

You thought Photoshop was dangerous? Take it a step further: We already have the technology to convincingly CGI the faces of dead people onto living actors. What’s to stop a political campaign from using this technology to make it appear that their opponent gave a speech encouraging everyone to, say, embrace Satan as their lord and master?

Oh, and, by the way, as one of those who is or soon will be responsible for making your company more “Digital,” it likely won’t be long before you find yourself figuring out whether, in this brave new world, it is more blessed to give than to receive. Because while this is less politically alarming, do you doubt that your Marketing Department won’t want to be the last one on its block to have these new toys to play with?

The same technologies our geopolitical opponents have used and will use to sell us their preferred candidates for office will undoubtedly help marketeers everywhere sell us their products and services.

How to solve this?

It’s quite certain prevention isn’t an option, although, as advocated in this space once or twice, we might hope for legislation restricting First Amendment rights to actual human persons and not their technological agents, and, beyond that, explicitly limiting the subjects non-humans are allowed to speak about while also requiring all non-human messengers to clearly identify themselves as such.

We might also hope that, unlike the currently pitiful enforcement of the Do-Not-Call Implementation Act of 2003, enforcement of the Shut the ‘Bots Up Act of 2019 would be more vigorous.

Don’t hold your breath.

What might help at least a bit would be development of AI defenses for AI offenses.

Way back in 1997 I proposed that some independent authority should establish a Trusted Information Provider (TIP) certification that information consumers could use to decide which sources to rely on.

What we need now is like that, only using the same amplification techniques the bad guys are using. We need something a lot like spam filters and malware protection — products that use AI techniques to identify and warn users about ‘bot-authored content.

Of course, we’d then need some way to distinguish legitimate ‘bot-blocking software from phony alternatives.

Think of it as full employment for epistemologists.

Pop quiz!

Question #1: In the past 20 years, the proportion of the world population living in extreme poverty has (A) almost doubled; (B) remained more or less the same; (C) almost halved.

Question #2: Worldwide, 30-year-old men have spent 10 years in school. How many years have women of the same age spent in school? (A) 9 years; (B) 6 years; (C) 3 years.

The correct answers are C and A. If you got them wrong, you have a lot of company. Across a wide variety of groups worldwide, faced with these and many more questions with factual answers, people do far worse than they would by choosing responses at random.
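
To put a number on “worse than random,” here is a minimal sketch of the chance baseline. It assumes three answer options per question and uniformly random guessing; the function names are illustrative and don’t come from the book.

```python
from fractions import Fraction


def chance_baseline(num_questions, options_per_question=3):
    """Expected number of correct answers when every answer is a uniform random guess."""
    return Fraction(num_questions, options_per_question)


def prob_all_wrong(num_questions, options_per_question=3):
    """Probability that a pure guesser misses every single question."""
    p_wrong = Fraction(options_per_question - 1, options_per_question)
    return p_wrong ** num_questions


# The two-question quiz above: a guesser expects 2/3 of a question right
# and misses both questions 4 times out of 9.
print(chance_baseline(2))
print(prob_all_wrong(2))
```

Any group that averages fewer than one correct answer in three on questions like these is doing worse than guesswork.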

Which brings us to the next addition to your KJR bookshelf: Factfulness: Ten Reasons We’re Wrong About the World – and Why Things are Better Than You Think (Hans Rosling with Ola Rosling and Anna Rosling Rönnlund, Flatiron Books 2018). Unlike books that rely on cognitive science to explain why we’re all so illogical so often, Rosling focuses on the how of it. Factfulness is about the mistakes we make when data are available to guide us but, for one reason or another, we don’t consult them to form our opinions. Viewed through this lens, it appears we’re all prone to these ten bad mental habits:

  1. Gaps: We expect to find chasms separating one group from another. Most of the time the data show a continuum. Our category boundaries are arbitrary.
  2. Negativity: We expect news, and especially trends, to be bad.
  3. Extrapolation: We expect trend lines to be straight. Most real-world trends are S-shaped, asymptotic, or exponential.
  4. Fear: What we’re afraid of often doesn’t line up with what the most important risks actually are.
  5. Size: We often fall for numbers that seem alarmingly big or small, but for which we’re given no scale. Especially, we fall for quantities that are better expressed as ratios.
  6. Generalization: We often use categories to lump unlike things together inappropriately, and fail to lump like things together. Likewise, we treat an anecdote or individual as representative of a category we assigned more or less arbitrarily, when it would be just as reasonable to consider them a member of an entirely different group.
  7. Destiny: It’s easy to think people are in the circumstances they’re in because it’s inevitable. In KJR-land we’ve called this the Assumption of the Present.
  8. Single Perspective: Beware the hammer and nail error, although right-thinking KJR members know the correct formulation is “If all you have are thumbs, every hammer looks like a problem.” Rosling’s advice: Make sure you have a toolbox, not just one tool.
  9. Blame: For most people, most of the time, assigning it is our favorite form of root-cause analysis.
  10. Urgency: The sales rep’s favorite. In most situations we have time to think, if we’d only have the presence of mind to use it. While analysis paralysis can certainly be deadly, mistaking reasonable due diligence for analysis paralysis is at least as problematic.
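
The Extrapolation habit (item 3) is easy to demonstrate with a little arithmetic. The sketch below is a hypothetical illustration, not something taken from the book: it assumes a logistic (S-shaped) trend with a made-up ceiling of 100, then extends a straight line through two early points on that curve.

```python
import math


def logistic(t, ceiling=100.0, rate=1.0, midpoint=5.0):
    """S-shaped trend that levels off at `ceiling` (all parameters are illustrative)."""
    return ceiling / (1.0 + math.exp(-rate * (t - midpoint)))


def straight_line_forecast(t0, t1, t_future, curve=logistic):
    """Extend the straight line through the curve's values at t0 and t1 out to t_future."""
    slope = (curve(t1) - curve(t0)) / (t1 - t0)
    return curve(t1) + slope * (t_future - t1)


actual = logistic(10)                        # where the S-curve really ends up
forecast = straight_line_forecast(4, 5, 10)  # what a ruler on the steep part predicts

# The straight-line forecast overshoots the ceiling the real trend can never pass.
print(round(actual, 1), round(forecast, 1))
```

Fitting the ruler to the steepest part of the curve is exactly what produces the alarming forecast; the underlying trend was always going to flatten.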

The book certainly isn’t perfect. There were times that, adopting my Mr. Yeahbut persona, I wanted to strangle the author, or at least have the opportunity for a heated argument. Example:

Question #3: In 1996, tigers, giant pandas, and black rhinos were all listed as endangered. How many of these three species are more critically endangered today? (A) Two of them; (B) One of them; (C) None of them.

The answer is C: none are more critically endangered, which might lead an unwary reader to conclude we’re making progress on mass species extinction. It made me wonder why Rosling chose these three species and not, say, Hawksbill sea turtles, Sumatran orangutans, and African elephants, all of which are more endangered than they were twenty years ago.

Yeahbut, this seems like a deliberate generalization error to me, especially as, in contrast to the book’s many data-supported trends, it provides no species loss trend analysis.

But enough griping. Factfulness is worth reading just because it’s interesting, and surprisingly engaging given how hard it is to write about statistical trends without a soporific result.

It also illustrates well why big data, analytics, and business intelligence matter, providing cautionary tales of the mistakes we make when we don’t rely on data to inform our opinions.

I’ll finish with a Factfulness suggestion that would substantially improve our world, if only everyone would adopt it: In the absence of data it’s downright relaxing to not form, let alone express, strongly held opinions.

Not having to listen to them? Even more relaxing.