Cloud Diversification and How Amazon Broke the Internet
In the early hours of Monday, October 20th, AWS experienced a DNS resolution failure that had a cascading effect on business and consumer applications across the internet.
In this 7-minute podcast, Deb Boehling from LB3 joins Tony Mangino to discuss what happened and specific cloud diversification strategies that enterprise customers should consider.
If you would like to learn more about our experience in this space, please visit our Information Technology Advisory Services and Technology Consulting & Strategy Development webpages.
Follow us on LinkedIn: TC2 & LB3
Tony
Hello, today is Thursday, October 23, 2025. I’m Tony Mangino from TC2 and this is Staying Connected.
As you may have heard, on Monday a major AWS outage disrupted everything from work apps to home devices. News-flash updates popped up so frequently even I lost track of them. Things are calmer now, but the impact — and the lessons — are not going away.
Joining me today is Deb Boehling, senior partner at LB3. We’ll walk through what happened, how enterprise customers were affected, and — most importantly — what steps you should be taking now, because yes: it’ll happen again.
Deb
Hi Tony, thanks for having me. Let me start with some background on AWS’s latest outage.
In the early hours of Monday, October 20, around 3:11 a.m. ET, AWS’s US-EAST-1 region (Northern Virginia) experienced a DNS (Domain Name System) resolution failure tied to the DynamoDB API endpoint. Put simply: DNS is the internet’s address book, translating human-friendly names into the IP addresses machines use. When it failed here, applications and services couldn’t find their database endpoints.
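To make that concrete for the technically inclined, here is a minimal Python sketch, illustrative only and not how any particular AWS SDK actually works, of the lookup an application must complete before it can reach that endpoint, and of the error it sees when resolution fails:

```python
import socket

# AWS's public regional DynamoDB API hostname; the lookup below is the
# "address book" step every client must complete before it can connect.
endpoint = "dynamodb.us-east-1.amazonaws.com"

try:
    # getaddrinfo asks DNS to translate the name into IP addresses.
    addresses = socket.getaddrinfo(endpoint, 443, proto=socket.IPPROTO_TCP)
    print([info[4][0] for info in addresses])
except socket.gaierror as err:
    # On October 20 this is roughly where things stopped: the name would not
    # resolve, so applications could not even attempt a connection.
    print(f"DNS resolution failed for {endpoint}: {err}")
```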
The failure cascaded: trouble in one “core” service triggered problems across other dependent services. AWS itself said the issue impacted services reliant on that region’s endpoints.
While AWS reported full service restoration by approximately 6:01 p.m. ET, the knock-on effects lasted much longer for many systems.
Tony
That’s brutal. The operational and reputational impacts to businesses — especially those on the East Coast — must have been extensive.
Deb
Absolutely. Living in Northern Virginia myself, I was right in the affected region, yet my own impact was minor (just slower systems). Many others were hit far harder.
We saw a broad spectrum of disruptions: business apps like Salesforce, Slack and Heroku, financial apps like Venmo and Coinbase, gaming platforms like Roblox and Fortnite, smart-home devices like Ring, airline apps for carriers like Delta and United, even systems for education and small business.
The truth is, we depend on a handful of cloud providers to host massive parts of our digital economy. When one of the major hubs stumbles, the ripple is huge. Experts estimate lost productivity and service interruptions could reach into the hundreds of billions of dollars.
Tony
It reminds me of prior incidents like the SolarWinds hack or disruptions at Snowflake Inc. Considering that, I would have thought businesses would have stronger backup plans to avoid this scenario. What should businesses be doing now?
Deb
Yes, many have worked on mitigation measures — but what they work on and how is critical.
For example, AWS holds roughly 30% of the global cloud infrastructure market. Many enterprises assume that spreading workloads across multiple Availability Zones within one provider equals resilience. But in this case, even zone diversification within AWS wasn’t enough, because the root failure was at a foundational layer (DNS and the DynamoDB endpoint in US-EAST-1) that cut across zones.
One clear takeaway: true provider diversification, meaning the use of more than one cloud provider, matters. We heard from companies that fared better because they used multiple clouds and/or their own data centers. One firm, for example, told the WSJ it did not go down because of its multi-cloud strategy (I’m paraphrasing its comments for our discussion).
Some additional strategies might include:
- Spread workloads across different cloud providers (e.g., AWS + Azure + GCP) rather than just multiple zones in one provider (see the failover sketch after this list).
- Consider maintaining critical apps/data in on-prem or co-location infrastructure you control, or employing hybrid models.
- Ensure your architecture doesn’t rely on a single region or a single provider for “routing” or DNS resolution.
- Conduct regular failure-mode and recovery exercises that simulate cloud provider failure (not just your app component failure).
- Review the dependencies of your dependencies (i.e., “if AWS US-EAST-1 sneezes, will you catch a cold?”). Even systems hosted on other clouds (e.g., Office 365) might rely indirectly on AWS infrastructure through inter-cloud dependencies.
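As a rough illustration of the first point, here is a hypothetical Python sketch of client-side failover between the same service deployed on two providers. The endpoint URLs are invented for this example, and a production setup would more likely rely on DNS-based or load-balancer failover, but the underlying idea, checking health and routing around a dead provider, is the same:

```python
import urllib.request

# Hypothetical endpoints: the same health check exposed by deployments on two
# different cloud providers. These URLs are invented for illustration.
ENDPOINTS = [
    "https://api.primary-cloud.example.com/health",
    "https://api.secondary-cloud.example.com/health",
]

def first_healthy(endpoints, timeout=2):
    """Return the first endpoint that answers its health check, or None."""
    for url in endpoints:
        try:
            with urllib.request.urlopen(url, timeout=timeout) as resp:
                if resp.status == 200:
                    return url
        except OSError:
            # DNS failures, timeouts, and connection errors all mean
            # "this provider is unavailable right now"; try the next one.
            continue
    return None

target = first_healthy(ENDPOINTS)
print("Routing traffic to:", target or "no healthy provider; degrade gracefully")
```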
Tony
So what you’re saying is: don’t assume you’re safe just because you’re in multiple AWS zones; you need to actively assume that part of your provider could go offline, and plan for that.
Deb
Exactly. In fact, the October 20 AWS outage exposed how fragile centralized cloud reliance still is, even for the big players. As one expert put it: “When the system couldn’t correctly resolve which server to connect to, cascading failures took down services across the internet.”
It’s a reminder that risk is real, even if you have “good cloud hygiene.” Your architecture must assume failure not just of your app, but of the underpinning provider services.
So yes, we’re not talking about eliminating risk entirely; we’re talking about minimizing business and reputational risk. It’s about preparedness.
Tony
Any final words for our listeners, Deb?
Deb
The October 20 AWS outage should serve as a wake-up call: even the biggest cloud provider falters. For businesses, developers and IT leaders, this is an invitation to evaluate your cloud risk, diversify your cloud supply chain, invest in resilient architectures, and test for real-world failure scenarios — because this type of large-scale disruption will happen again.
Don’t wait for the next outage to find out if you’re vulnerable.
Tony
Thanks very much for your time today, Deb! And if you would like to learn more about cloud diversification strategies, or if you’d like to discuss other ICT needs with Deb or me, or any of our LB3 and TC2 colleagues, please give us a call or shoot us an email.
You can also stay current by subscribing to Staying Connected, by checking out our websites, and by following us on LinkedIn.