Useful metrics have to satisfy the seven C’s.

Until two weeks ago it was the six C’s (Keep the Joint Running: A Manifesto for 21st Century Information Technology, Bob Lewis, IS Survivor Publishing, 2006). That’s when I found myself constructing a metric to assess the health of the integration layer as part of rationalizing clients’ application portfolios.

In case you haven’t yet read the Manifesto (and if you haven’t, what are you waiting for?), metrics must be connected, consistent, calibrated, complete, communicated, and current. That is, they’re:

> Connected to important goals or outcomes.

> Consistent — they always go in one direction when the situation improves and in the opposite direction when it deteriorates.

> Calibrated — no matter who takes the measurement they report the same number.

> Complete, to avoid the third metrics fallacy — anything you don’t measure you don’t get.

> Communicated, because the biggest benefit of establishing metrics is that they shape behavior. Don’t communicate them and you get no benefit.

> Current — when goals change, your metrics had better change too, or they’ll make sure you get your old goals, not your current ones.

The six C’s seemed to do the job quite well, right up until I got serious about establishing application integration health metrics. That’s when I discovered that (1) just satisfying these six turned out to be pretty tough; and (2) six didn’t quite do the job.

To give you a sense of the challenge, consider what makes an application’s integration healthy or unhealthy. There are two factors at work.

The first is the integration technique. At one extreme we have swivel-chairing, also known as integration by manual re-keying. Less bad but still bad are custom, batch point-to-point interfaces.

At the other extreme are integration platforms like enterprise application integration (EAI), enterprise service buses (ESBs), and integration platform as a service (iPaaS) that provide for synchronization and access by way of single, well-engineered connectors.

Less good but still pretty good are unified data stores (UDS).

The second factor is the integration count — the more interfaces needed to keep an application’s data synchronized to every other application’s data, the worse the integration score.
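
To make these two factors concrete, here’s a toy illustration, in Python, of how they might combine. It is emphatically not the metric I eventually built (more on that in a moment); the technique weights and the combining rule are invented for illustration:

```python
# A toy integration-health score. The weights are invented; the point
# is only that the score grows with both the interface count and the
# badness of each technique.

TECHNIQUE_WEIGHT = {
    "esb_connector": 1,         # single well-engineered connector (EAI/ESB/iPaaS)
    "unified_data_store": 2,    # less good but still pretty good
    "point_to_point_batch": 5,  # less bad but still bad
    "manual_rekeying": 10,      # swivel-chairing
}

def integration_score(interfaces):
    """interfaces: one technique name per interface an application needs.

    Bigger is worse, and the score only moves in one direction as the
    situation deteriorates -- which is the Consistency test.
    """
    return sum(TECHNIQUE_WEIGHT[t] for t in interfaces)

# One clean ESB connector vs. five point-to-point batch interfaces:
print(integration_score(["esb_connector"]))             # 1
print(integration_score(["point_to_point_batch"] * 5))  # 25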

Here’s where it gets tricky.

The biggest challenge turned out to be crafting a Consistent metric. Without taking you through all the ins and outs of how I eventually solved the problem (sorry — there is some consulting IP I do need to charge for), I did arrive at a metric that reliably got smaller with better integration engineering and bigger with an integration tangle.

The metric did well at establishing better and worse. But it failed to establish good vs. bad. I needed a seventh C.

Well, to be entirely honest about it, I needed an “R” (range), but since “Seven C’s” sounds much cooler than “Six C’s and an R,” Continuum won the naming challenge.

What it means: Good metrics have to be placed on a well-defined continuum whose poles are the worst possible reading on one end and the best possible reading on the other.

When it comes to integration, the best possible situation is a single connector to an ESB or equivalent integration platform.

The worst possible situation is a bit more interesting to define, but with some ingenuity I was able to do this, too. Rather than detail it here I’ll leave it as an exercise for my fellow KJR metrics nerds. The Comments await you.
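
One hint for those taking up the exercise: once you’ve defined your worst case, placing a raw score (the toy score above, say) on the continuum is mechanically simple. A sketch, deliberately leaving the worst-case pole as an input you supply:

```python
def on_continuum(raw_score, best_score, worst_score):
    """Place a raw score on a well-defined continuum.

    0.0 is the best possible reading (a single connector to an ESB or
    equivalent platform); 1.0 is the worst possible reading, passed in
    as a parameter because defining it is the reader's exercise.
    """
    if worst_score == best_score:
        raise ValueError("the two poles must differ")
    position = (raw_score - best_score) / (worst_score - best_score)
    return min(max(position, 0.0), 1.0)  # clamp readings to the poles

# A single ESB connector scores 1 (the best pole); suppose the worst
# case you define scores 100:
print(on_continuum(1, best_score=1, worst_score=100))   # 0.0 -- good
print(on_continuum(25, best_score=1, worst_score=100))  # ~0.24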

The point

The point of this week’s exercise isn’t how to measure the health of your enterprise architecture’s integration layer.

It also isn’t to introduce the seventh C, although I’m delighted to do so.

The point is how much thought and effort went into constructing this one metric, which is just one of twenty or so characteristics of application health that need measurement.

Application and integration health are, in turn, two of five contributors to the health of a company’s overall enterprise technical architecture; the enterprise technical architecture is one of four factors that determine IT’s overall organizational health; and IT health is one of ten dimensions that make up the health of the overall enterprise.

Which, at last, gets to the key issue.

If you agree with the proposition that you can’t manage if you can’t measure, then everything that must be managed must be measured.

Count up everything in the enterprise that has to be managed, consider just how hard it is to construct metrics that can sail the seven C’s …

… is it more likely your company is managed well through well-constructed metrics, or managed wrong by being afflicted with poorly designed ones?

It’s Lewis’s metrics corollary: You get what you measure. That’s the risk you take.

Irony fans, rejoice. AI has entered the fray.

More specifically, the branch of artificial intelligence known as self-learning AI, also known as machine learning, sub-branch neural networks, is taking us into truly delicious territory.

Before getting to the punchline, a bit of background.

“Artificial Intelligence” isn’t a thing. It’s a collection of techniques mostly dedicated to making computers good at tasks humans accomplish without very much effort — tasks like: recognizing cats; identifying patterns; understanding the meaning of text (what you’re doing right now); turning speech into text, after which see previous entry (what you’d be doing if you were listening to this as a podcast, which would be surprising because I no longer do podcasts); and applying a set of rules or guidelines to a situation so as to recommend a decision or course of action, like, for example, determining the best next move in a game of chess or Go.

Where machine learning comes in is making use of feedback loops to improve the accuracy or efficacy of the algorithms used to recognize cats and so on.
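
For the curious, here’s the feedback loop at its most stripped-down: a toy perceptron that nudges its weights whenever a labeled example proves it wrong. It’s a sketch for illustration, not how any production recognizer actually works:

```python
def train_perceptron(examples, epochs=20, lr=0.1):
    """examples: list of (features, label) pairs with label in {0, 1}."""
    n_features = len(examples[0][0])
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(epochs):
        for features, label in examples:
            activation = sum(w * x for w, x in zip(weights, features)) + bias
            guess = 1 if activation > 0 else 0
            error = label - guess  # the feedback signal: -1, 0, or +1
            weights = [w + lr * error * x for w, x in zip(weights, features)]
            bias += lr * error
    return weights, bias

# Learn a trivial "cat or not" rule from two made-up features:
examples = [([1, 1], 1), ([1, 0], 1), ([0, 1], 0), ([0, 0], 0)]
print(train_perceptron(examples))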

Along the way we seem to be teaching computers to commit sins of logic, like, for example, the well-known fallacy of mistaking correlation for causation.

Take, for example, a fascinating piece of research from the Pew Research Center that compared the frequencies of men and women in Google image searches of various job categories to the equivalent U.S. Department of Labor percentages (“Searching for images of CEOs or managers? The results almost always show men,” Andrew Van Dam, The Washington Post’s Wonkblog, 1/3/2019).

It isn’t only CEOs and managers, either. The research showed that “…In 57 percent of occupations, image searches indicate the jobs are more male-dominated than they actually are.”

While we don’t know exactly how Google image searches work, somewhere behind all of this the Google image search AI must have discovered some sort of correlation between images of people working and the job categories those images typify. The correlation led to the inference that male-ness causes CEO-ness; also, strangely, bartender-ness and claims-adjuster-ness, to name a few other misfires.
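
A cartoon of the failure mode, with invented numbers rather than Pew’s: a “model” that merely learns the gender frequencies in its skewed training images will dutifully serve the skew back up, and nothing in the arithmetic distinguishes correlation from causation:

```python
from collections import Counter

# A deliberately skewed training corpus; the 90/10 split is invented
# for illustration, not taken from the Pew research.
training_labels = ["male"] * 90 + ["female"] * 10  # people tagged "CEO"

ceo_counts = Counter(training_labels)

def predict_ceo_gender():
    # The corpus correlation, served back up as if it were causation.
    return ceo_counts.most_common(1)[0][0]

print(predict_ceo_gender())  # "male", regardless of the actual workforce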

Skewed Google occupation image search results are, if not benign, probably quite low on the list of social ills that need correcting.

But it isn’t much of a stretch to imagine law-enforcement agencies adopting similar AI techniques, resulting in correlation-implies-causation driven racial, ethnic, and gender-based profiling.

Or, closer to home, to imagine your marketing department relying on equivalent demographic or psychographic correlations, leading to marketing misfires when targeting messages to specific customer segments.

I said the Google image results must have come from some sort of correlation technique, but that isn’t entirely true. It’s just as possible Google is making use of neural network technology, so called because it roughly emulates how AI researchers imagine the human brain learns.

I say “roughly emulates” as a shorthand for seriously esoteric discussions as to exactly how it all actually works. I’ll leave it at that on the grounds that (1) for our purposes it doesn’t matter; (2) neural network technology is what it is whether or not it emulates the human brain; and (3) I don’t understand the specifics well enough to go into them here.

What does matter about this is that when a neural network … the technical variety, not the organic version … learns something or recommends a course of action, there doesn’t seem to be any way of getting a read-out as to how it reached its conclusion.

Put simply, if a neural network says, “That’s a photo of a cat,” there’s no way to ask it “Why do you think so?”

Okay, okay, if you want to be precise, it’s quite easy to ask it the question. What you won’t get is an answer, just as you won’t get an answer if it recommends, say, a chess move or an algorithmic trade.
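
To make that concrete: any neural network library will happily hand you an answer, and even a confidence, but no reasoning. A sketch using scikit-learn’s small MLP on made-up “photos”:

```python
from sklearn.neural_network import MLPClassifier

X = [[0, 0], [0, 1], [1, 0], [1, 1]]  # four toy "photos"
y = [0, 0, 1, 1]                      # 1 = "that's a cat"

net = MLPClassifier(hidden_layer_sizes=(8,), solver="lbfgs", random_state=0)
net.fit(X, y)

print(net.predict([[1, 0]]))        # its answer
print(net.predict_proba([[1, 0]]))  # its confidence
# But there is no net.explain(). The "reasoning" is smeared across
# net.coefs_, a pile of weights with no human-readable story attached.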

Which gets us to AI’s entry into the 2019 irony sweepstakes.

Start with big data and advanced analytics. Their purpose is supposed to be moving an organization’s decision-making beyond someone in authority “trusting their gut,” to relying on evidence and logic instead.

We’re now on the cusp of hooking machine-learning neural networks up to our big data repositories so they can discover patterns and recommend courses of action through more sophisticated means than even the smartest data scientists can achieve.

Only we can’t know why the AI will be making its recommendations.

Apparently, we’ll just have to trust its guts.

I’m not entirely sure that counts as progress.