The internet is now an essential household utility for many Americans, maybe even on the same footing as running water.
It turns out the internet’s pipes can spring a leak, too.
In the past three weeks, two major outages at Amazon’s cloud computing services have led to widespread disruptions at other online services. Last month, a problem at Comcast, one of the largest internet service providers in the U.S., led to widespread outages. (Comcast owns NBC News.) And in June, websites around the world were temporarily knocked offline when Fastly, a cloud computing service provider, dealt with “service configuration” issues.
The drumbeat of issues underscores that the internet, despite all it’s capable of, is sometimes fragile.
“It’s expected to be like your power or your water, and they sometimes go down,” said Steve Moore, the chief security strategist at the cybersecurity firm Exabeam.
The latest disruption occurred Wednesday, when customers of DoorDash, Hulu and other websites complained that they couldn’t connect. The problems were traced to Amazon Web Services, or AWS, the most widely used cloud services company, which reported that outages in two of its 26 geographic regions were affecting services nationwide.
A similar disruption took place Dec. 7, crippling video streams, halting internet-connected robot vacuum cleaners and even shutting down pet food dispensers in a series of reminders of how much life has moved online, especially during the coronavirus pandemic. AWS published an unusually detailed description of what went wrong, along with an apology.
The incidents helped to explode the illusion, reinforced by decades of steadily improving internet speed and reliability, that everyday consumers can rely on online services to be available without fail.
It used to be that online video meant watching “a low-res video for five minutes,” said Robert Blumofe, the executive vice president and chief technology officer at Akamai Technologies. Akamai sells security services as well as “edge computing” capabilities, a kind of distributed technology that doesn’t rely as much on centralized data centers.
“Now, there’s a very strong expectation that you could watch an entire movie in high-res,” Blumofe said. “There’s a recency bias. We remember the immediate and the now more than we remember the way things were in the past,” when outages were frequent.
In other words, some Americans who enjoy reliable internet access may have become a little spoiled.
Experts in computer science and security said the interruptions don’t really call into question the fundamental design of the internet, which was founded in part on the idea that a distributed system can mostly continue functioning even if one piece goes down.
But they said the problems are rooted in the uneven development of the internet: Certain data centers are more important than others; the cloud businesses run by Amazon, Google and Microsoft concentrate more power; and corporate customers of cloud services don’t always want to pay extra for backup systems and staff members.
Sean O’Brien, a lecturer in cybersecurity at Yale Law School, said the outages call into question the wisdom of relying so much on big data centers.
“‘The cloud’ has never been sustainable and is merely a euphemism for concentrated network resources controlled by a centralized entity,” he said, adding that alternatives like peer-to-peer technology and edge computing may gain favor. He wrote after last week’s outage that the big cloud providers amounted to a “feudal” system.
Cloud service providers make money by selling server space to other businesses on flexible terms and with specialized expertise, reducing the need for companies to manage their own servers. They rarely fail, but they get attention when they do. An AWS outage in November 2020 affected clients like Apple.
“There are many points of failure whose unavailability or suboptimal operation would affect the entire global experience of the internet,” said Vahid Behzadan, an assistant professor of computer science at the University of New Haven.
Some of those points of failure — such as AWS' “us-east-1” region — have become notorious among tech workers who share their experiences on industry message boards.
“The fact that we’ve had repeated outages in a short period of time is a cause for alarm,” Behzadan said, noting that U.S. businesses have staked a lot on the assumption that cloud services are resilient.
He also said that if outages become more common or publicly visible, corporate clients are likely to respond by spending more for backup systems to ensure they’re resilient in case of breakdowns — having contracts with both Google and Amazon, for example. There’s now a rekindled industry debate over whether to go “multicloud,” CNBC reported, and companies across sectors are spending more on edge computing tools.
“The internet will not die any time soon. But whatever won’t kill the internet makes it stronger,” Behzadan said.
Moore, of Exabeam, said the tightening labor market nationwide might also be affecting cloud services and internet reliability, as any increase in churn reduces the experience level of the people in charge.
“We’re coming off unprecedented times where people are incredibly stressed and the expectations for cloud infrastructure have been higher than ever,” he said. “Organizations are playing catch-up.”