4 Comments
Spencer Finkel

If I remember correctly, the AWS console was only intermittently reachable (and not at all for some people), which made manual failovers from Route53 extremely difficult, not to mention accessing the health dashboard.

John S

Great article comparing the different providers in this way! I enjoyed the cheesy joke at the end; it was a nice personal touch.

Brian Williams

Gergely, when you say that all providers promptly acknowledged the incident, how do you define "promptly"? Where I work, we often have the status page updated to reflect an incident about 15 minutes after it starts. This is the time it takes to set up an incident call, understand the user impact, draft communications to customers, and have that draft approved. Some users and customers complain that this is too slow. I'm curious what the industry standard is here. Would you say that this is a good response time or not?

Gergely Orosz

@Brian: this is actually a good point, and one I kind of hand-waved! For the two cloud providers we have logs for, it took 20-30 minutes from the start of the issue for them to acknowledge it: AWS took 20 minutes, and GCP took 30 minutes. Let me update this.
