April 29, 2011 at 11:42 AM ET
Amazon's EC2 Web-hosting service, on which many popular sites and services rely, recently suffered a technical problem that took some of our favorite Internet destinations off the map. It was a long while before all issues were resolved, and some data was permanently destroyed, but Amazon has at least stepped up to the plate to apologize and list the steps it will take to prevent similar problems in the future.
First, Amazon plans to give customers the ability to take advantage of multiple Availability Zones, which are like independent server clusters in the cloud, each designed to be insulated from failures in the other zones. Amazon also plans to make it easier to deploy a service across multiple Availability Zones, which is, by its own admission, a daunting task.
Amazon also plans to invest in speedier recovery from failures and improve communication with its clients. “We switched to more regular updates part of the way through this event and plan to continue with similar frequency of updates in the future. In addition, we are already working on how we can staff our developer support team more expansively in an event such as this, and organize to provide early and meaningful information, while still avoiding speculation,” says the AWS team.
Amazon is also making it a point to automatically provide a service credit to customers who were affected by the most recent service outage while expressing sympathy for the troubles they experienced:
“Last, but certainly not least, we want to apologize. We know how critical our services are to our customers’ businesses and we will do everything we can to learn from this event and use it to drive improvement across our services. As with any significant operational issue, we will spend many hours over the coming days and weeks improving our understanding of the details of the various parts of this event and determining how to make changes to improve our services and processes.”
It's worth noting that Amazon's statement did not provide statistics on how much more productive people around the world were while popular Internet destinations such as link-sharing site Reddit, location-based social network Foursquare, URL shortener ow.ly, and application hosting service Cydia were down.