Staging is a trap - Tyler Cipriani


Software staging clusters only grow.

As production accrues more services, staging’s costs ramp up.

And maintaining a single, massive, production-like staging may no longer be the right answer.

But several small staging clusters, each fit for its purpose, offer a more maintainable, cheaper alternative.

🍝 There is no perfect staging; there are only perfect stagings

Each reason for having a staging cluster requires a different level of “production-likeness” to fit its use.

You could put Apache and PHP on a Raspberry Pi and call it Wikipedia’s “staging.”

And that’d be a fine place to demo a MediaWiki patch. But to be confident deploying that patch into Wikipedia’s production (to paraphrase “Jaws” 🦈), you’re gonna need a bigger staging.

In the early 1980s, Campbell’s tasked the psychophysicist Howard Moskowitz with finding the perfect pasta sauce for its Prego brand, which was struggling against Ragu.¹

Moskowitz concluded the perfect pasta sauce doesn’t exist—it depends on each individual’s wants and needs. There is no perfect pasta sauce; there are only perfect pasta sauces.

Howard’s work is why you’ll find Ragu Old World Style® next to Ragu Chunky Garden Vegetable next to tens of other Ragus.

And much like pasta sauce²: there can be no perfect staging; only perfect stagings.

💱 Staging trades cost vs. production-likeness

The requirements for a staging cluster depend on its use.

Demos – Demoing new code requires the same software, maybe a few microservices, and (possibly) a subset of production data.
- 🟢 Low cost
- 🟢 Low production-likeness
Exploratory testing – QA requires the same software, services, and a subset of production data. It’d be nice to run it on the same type of infrastructure, too.
- 🟡 Medium cost
- 🟡 Medium production-likeness
Deployment confidence – 100% confidence requires a parallel universe you destroy whenever a deployment goes wrong.
- 🔴 High cost
- 🔴 High production-likeness

Staging trades costs for nearness to production

Staging is a trade-off: you spend resources (money, people, time) to asymptotically approach actual production.

The closer you get to production, the higher the costs and complexity.
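One way to make that trade-off explicit is to write the three uses down as data. The following is only a sketch: the `Level` scale, the `StagingProfile` type, and the `cheapest_profile_for` helper are hypothetical names invented for illustration, and the levels simply mirror the low/medium/high ratings listed above.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Tuple


class Level(IntEnum):
    """Coarse rating used for both cost and production-likeness."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3


@dataclass(frozen=True)
class StagingProfile:
    purpose: str
    cost: Level
    production_likeness: Level
    needs: Tuple[str, ...]


# The three uses described above, encoded as data (illustrative only).
PROFILES = (
    StagingProfile("demo", Level.LOW, Level.LOW,
                   ("same software", "a few microservices",
                    "subset of production data")),
    StagingProfile("exploratory testing", Level.MEDIUM, Level.MEDIUM,
                   ("same software", "same services",
                    "subset of production data")),
    StagingProfile("deployment confidence", Level.HIGH, Level.HIGH,
                   ("a parallel copy of production",)),
)


def cheapest_profile_for(required_likeness: Level) -> StagingProfile:
    """Pick the lowest-cost profile that is production-like enough."""
    candidates = [p for p in PROFILES
                  if p.production_likeness >= required_likeness]
    return min(candidates, key=lambda p: p.cost)
```

Asking for `Level.LOW` returns the demo profile, which is the point: nothing forces a demo to pay for a deployment-confidence cluster.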

🔁 Production should be reproducible

Setting up a staging server should be easy. If it is not easy, you already have a problem in your infrastructure, you just don’t know it yet.

– Patrick McKenzie 🐉, Staging Servers, Source Control & Deploy Workflows, And Other Stuff Nobody Teaches You

Configuration management should make it easy to rebuild production from scratch. Otherwise, you’ve got a disaster in the offing.

Being able to rebuild production from scratch also gives you an environment suitable for demos and end-to-end test automation.
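As a rough illustration of why a reproducible production makes small stagings cheap, imagine production described as plain data and a staging derived from it. Everything here is assumed for the sake of the example: the `PRODUCTION` layout, the service names, and the `derive_environment` helper are hypothetical and not tied to any real configuration management tool.

```python
from copy import deepcopy

# Hypothetical, simplified description of production.
PRODUCTION = {
    "services": {
        "web": {"replicas": 12},
        "db": {"replicas": 3},
        "cache": {"replicas": 4},
    },
    "data": "full",
}


def derive_environment(base, keep_services, replicas=1, data="subset"):
    """Build a smaller environment from the same definition as production.

    Only the services in `keep_services` are kept, each scaled down to
    `replicas`, and the data set is swapped for `data`.
    """
    env = deepcopy(base)  # never mutate the production definition
    env["services"] = {
        name: {**spec, "replicas": replicas}
        for name, spec in env["services"].items()
        if name in keep_services
    }
    env["data"] = data
    return env


# A demo staging: same definitions, fewer services, less data.
demo = derive_environment(PRODUCTION, keep_services={"web", "db"})
```

Because the demo environment comes from the same definition, drift between staging and production shows up as a difference in parameters rather than a difference in hand-built servers.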

But organizations are evolving away from using pre-production staging to build deployment confidence.

They’ve replaced their high-cost staging with a mix of canary deployments and advanced feature flagging.
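For readers who haven't seen these techniques, here is a minimal sketch of both ideas. It is not how any particular organization or library does it: the function names, the SHA-256 bucketing scheme, and the `tolerance` threshold are all assumptions made for illustration.

```python
import hashlib


def rollout_bucket(flag_name: str, user_id: str) -> int:
    """Map a (flag, user) pair to a stable bucket from 0 to 99."""
    key = f"{flag_name}:{user_id}".encode("utf-8")
    return int(hashlib.sha256(key).hexdigest(), 16) % 100


def is_enabled(flag_name: str, user_id: str, rollout_percent: int) -> bool:
    """Feature flag: on for roughly `rollout_percent` percent of users.

    The same user always lands in the same bucket, so raising the
    percentage only ever adds users.
    """
    return rollout_bucket(flag_name, user_id) < rollout_percent


def canary_is_healthy(canary_errors: int, canary_requests: int,
                      baseline_errors: int, baseline_requests: int,
                      tolerance: float = 0.01) -> bool:
    """Canary check: compare the canary's error rate with the baseline's.

    Returns False when either side has seen no traffic, since there is
    no evidence to promote on.
    """
    if canary_requests == 0 or baseline_requests == 0:
        return False
    canary_rate = canary_errors / canary_requests
    baseline_rate = baseline_errors / baseline_requests
    return canary_rate <= baseline_rate + tolerance
```

Both functions build confidence with real production traffic, exposing a small slice of users or servers first, instead of relying on a production-sized copy.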

Small environments with narrow scope—like testing or demoing—seem like a reasonable trade-off of cost vs. benefit.

But using pre-production staging as insurance for your deployments—requiring snapshots of production data and maybe even replayed traffic—seems too. darn. expensive.

📚 Further reading

¹ This is an anecdote related by Malcolm Gladwell in a 2004 TED Talk, “Choice, Happiness, and Spaghetti Sauce.”
² There’s a joke here about “spaghetti code” that I’m too lazy to find.