The value of information struck me while I was waiting for an elevator.

In the lobby of most buildings each elevator’s position is displayed above its door. Armed with that information you can intelligently decide whether to wait for it or climb the stairs instead.

But elevators give you no positional information on any other floor, forcing you to guess whether waiting or using the stairwell is the better vertical travel decision.

Fixing this is trivial engineering. As the Rutles once sang, all you need is cash.

Not so for the New York subway system, as explained in “Why New York Subway Lines Are Missing Countdown Clocks,” (James Somers, The Atlantic, November, 2015), a charming and fascinating article that explains the answer to the author’s question, “I honestly just wanted to know why the F train didn’t have clocks. I never expected it to be so complicated.” (And thanks to long-time correspondent Leo Heska for calling the article to my attention.)

Turns out, the answer isn’t what’s complicated. The New York subway system relies on early 20th century technology whose architects had a clear and well-chosen design goal: Subway trains must not collide.

To accomplish this goal they engineered an elegant combination of sensors, switches, and on-track displays, through which drivers know whether the next section of track is occupied by another train. If so they slow down. If they don’t, a mechanical relay tied to the train-detection sensor automatically applies the brakes.

The system, that is, was designed for Operations, and relies on a highly decentralized combination of human and automated decision-making. Nothing about it identifies individual trains and their positions, so there’s nothing in it to repurpose to tell passengers when the next train will arrive, let alone support any management analytics.

Solving this is a non-trivial problem.

If these were trucks, a GPS receiver, IoT chip, and Google Maps hack would make it pretty easy. But we’re talking about subway trains. They don’t have line of sight to any GPS satellites, and so, never mind.

Maybe there’s nothing to solve. Knowing each train’s position and velocity is, after all, a luxury, not a necessity. Well, okay, except for this small detail: The entire system is worn out, there’s no source of spare parts, and even the wiring’s insulation is about shot.

Oh, and the estimates for replacing this 1930s-vintage technology with something modern start at $20 billion.

Does any of this sound familiar — a legacy system that would be good enough except its architecture is obsolete, the platforms it runs on aren’t around anymore, and:

  • “Lift-and-shift” replacement provides no new features, and so no business-driven value to justify the expense?
  • Nobody can describe important new features that would justify anything more than a lift-and-shift replacement?
  • Investing in any kind of replacement system would drain needed capital away from other efforts that are also important for the organization’s ongoing survival and success?

Of course it does.

We’re dealing with a linked pair of seldom-discussed IT disciplines: Lifecycle management and migration management. Lifecycle management is about detecting incipient obsolescence and preventing it. Migration management is about becoming excellent at replacing obsolete or near-obsolete systems.

Together they make obsolescence avoidance an operational matter, from both a budgeting and an execution perspective.

Competence at migration management is what makes IT very good at moving from obsolete technology to something modern enough to last a while. Lifecycle management is what says it’s time to repeat the cycle.

Here’s how they might have helped New York’s Metropolitan Transportation Authority:

By 1985 (to pick a year out of the air), the subway system relied on 50-year-old technology. Computerization was by then mainstream. The subway control system was clearly obsolete.

So imagine if the MTA had started migrating to a modern system in 1985 through a phased, route-by-route plan.

By now it would probably be time to start the next migration … but it would be from a far better base state, with no looming crises from a lack of spare parts and failing insulation driving a high-risk replacement project along.

Depreciation is the mechanism through which the general ledger depicts how a capital asset … the New York subway system’s control system being an example … loses value over time.

What’s strange is how many business executives consider it an accounting fiction. If they just believed their financial statements they’d bank the funds needed for capital asset replacement as standard operating procedure, and lifecycle management wouldn’t have to start with hat-in-hand supplication.

That’s right: The problem isn’t executives managing by the numbers.

It’s executives choosing to ignore them.

If everyone did everything right from the beginning, it would be easy.

“It,” in this case, is automated regression testing. “Did everything right” means building a strong test plan every time IT puts new software into production, and then adding it to the existing regression test suite.

Easy peasy.

Easy peasy if, that is, you’re a management consultant like me and not CIO of an actual IT organization. Easier peasier than the alternatives, perhaps, but easy? There’s nothing about software quality assurance that’s easy.

But very few IT disciplines rank as high in importance when it comes to succeeding as a modern IT organization.

The IT pundit class has discovered the need for speed, and about time. Speed, more than any other single factor, is what lets businesses outmaneuver their competitors, thereby profitably selling more products to more customers.

And … in most businesses, most of the time, delivering needed information technology is what limits the pace of change. Speed up IT, speed up the business.

Not that there aren’t any number of other factors waiting in the wings to keep things slow, because slow quickly becomes a habit, not a matter of critical path planning. But I digress …

For business to be faster, IT has to be faster at delivering changes to the applications portfolio.

Which in turn means DevOps is in your future, which in its turn means automated regression testing is in your future, if it isn’t an important part of your past and present.

Yes, DevOps. Call me a converted skeptic. Back when folks thought the lead DevOps story was that Dev and Ops were now collaborating, it earned a gigantic ho hum from yours truly.

But as it turns out, Dev working with Ops is the least interesting aspect of DevOps. What’s most interesting: DevOps blows up our old understanding of how to move software out of development and into production.

The Standard Model involves bundling software changes into major releases, which are then not only subjected to the full range of test protocols (unit, integration, regression, stress, and user acceptance), but also have to run the Change Advisory Board (CAB) gauntlet.

The assumption behind this bulky approach to software implementation is that each new release carries with it the potential for blowing up production … either by sucking up server or network capacity so as to slow everything to a crawl; by corrupting one or more corporate databases; by opening up a gaping, easily exploited security hole; or by some other nasty consequence software defects can cause.

Not that these concerns are unfounded. Software defects can cause any or all of these problems, and the bigger the release, the more opportunities there are for bugs to be hiding that can cause them.

What DevOps does that’s truly interesting is stand this equation on its head: Instead of bundling changes into the major releases that create so much risk that drastic measures are called for, it puts changes into production in large numbers of small doses.

And because each release is small, and has been … and this is crucial … subjected to automated regression and stress testing, the risk of it blowing things up is so small that the whole CAB process becomes a fifty buck solution to a five buck problem, as it were.
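For the code-minded, here is a minimal sketch of that idea in Python: a release gate that lets a small change into production only if the accumulated regression suite passes. Everything in it (`RegressionSuite`, `release`, the sample `order_total` function) is hypothetical and invented for illustration. A real shop would lean on its CI/CD tooling rather than hand-rolled code like this, and the deployment step here is just a stub.

```python
from typing import Callable, Dict, List


# Hypothetical application code under test.
def order_total(unit_price: float, quantity: int) -> float:
    """Return the total price for one order line."""
    return round(unit_price * quantity, 2)


class RegressionSuite:
    """A growing collection of checks; each release adds its own tests."""

    def __init__(self) -> None:
        self._checks: Dict[str, Callable[[], bool]] = {}

    def add(self, name: str, check: Callable[[], bool]) -> None:
        self._checks[name] = check

    def failures(self) -> List[str]:
        """Run every check and return the names of those that failed."""
        failed = []
        for name, check in self._checks.items():
            try:
                ok = check()
            except Exception:
                ok = False  # a check that crashes counts as a failure
            if not ok:
                failed.append(name)
        return failed


def release(change_name: str, suite: RegressionSuite) -> bool:
    """Release only if the whole suite passes (deployment itself is stubbed)."""
    failed = suite.failures()
    if failed:
        print(f"{change_name}: blocked by {failed}")
        return False
    print(f"{change_name}: released")
    return True


# Tests written for earlier releases stay in the suite and run every time.
suite = RegressionSuite()
suite.add("single item", lambda: order_total(9.99, 1) == 9.99)
suite.add("multiple items", lambda: order_total(2.50, 4) == 10.0)

release("small change", suite)
```

The point of the sketch is the shape, not the details: because the suite only ever grows, each small release gets checked against everything that came before it, automatically, without convening a meeting.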

The magic buzz-phrase is “continuous delivery,” and to give you an idea as to whether “continuous” is an exaggeration or not, way back in 2011 Amazon was making production changes every 11.6 seconds.

This incredibly rapid pace of change lets Amazon test different selling approaches, fine-tuning its merchandising to an astonishing and, if you’re a competing retailer, intimidating extent.

In your case?

Here’s where it gets interesting (or, depending on your level of interest, more interesting): As you know because you’re a regular reader here, there’s no such thing as an IT project, which means there’s no such thing as a software implementation.

This is something even the most advanced DevOps practitioners get wrong. When Amazon deploys its website changes, what’s changing is its selling approach to a large enough fraction of its online customers to provide a valid statistical comparison to its current practices.
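What might a “valid statistical comparison” look like? One common approach is a two-proportion z-test on conversion rates. To be clear, this is an assumption for illustration: nothing here says which method Amazon actually uses, and the function name and any numbers you feed it are made up. A minimal sketch in Python:

```python
import math
from typing import Tuple


def two_proportion_z_test(
    conversions_a: int, visitors_a: int,
    conversions_b: int, visitors_b: int,
) -> Tuple[float, float]:
    """Compare two conversion rates.

    Returns (z, p_value), where p_value is two-sided and relies on the
    normal approximation, so it is only reasonable for large samples.
    """
    rate_a = conversions_a / visitors_a
    rate_b = conversions_b / visitors_b

    # Pooled rate under the assumption that the two approaches perform alike.
    pooled = (conversions_a + conversions_b) / (visitors_a + visitors_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / visitors_a + 1 / visitors_b))
    if std_err == 0:
        raise ValueError("no variation in the data; the test is undefined")

    z = (rate_b - rate_a) / std_err
    # Two-sided tail probability of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value
```

A small p-value suggests the new selling approach really does convert differently from the current one. A large one says the sample doesn’t yet distinguish them, which is why the size of the customer fraction matters.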

When a DevOps team working on an internal application releases changes, for the most part they’re changes to internal business practices.

Which leads to this question: Should DevOps teams just slipstream changes into production as Amazon does on its website? If not …

* * *

Tell you what — I’m not going to do all the work on this. Post your thoughts on how IT should issue changes to internal business systems and the business processes and practices they support as Comments this week, and we’ll continue the conversation in next week’s KJR.