Major Cloud Providers – Monthly Outage Recap October
MS Azure
10/30/19 RCA – Storage – West Europe Between 10:17 and 13:52 UTC, a subset of customers using Storage, or Azure services with Storage dependencies, in West Europe may have experienced difficulties connecting to resources hosted in this region.
10/25/19
–Azure Databricks and Data Factory v2 – Workspace and Data Flow Errors Between approximately 11:00 and 14:40 UTC, a subset of customers using Azure Databricks may have received ‘No Web App available’ error notifications when logging into a Databricks workspace. Related API calls may also have failed to return a response. Additionally, a very limited subset of customers using Data Factory v2 may have received failure notifications for Data Flow jobs.
-DownDetector Problems at Microsoft Azure Microsoft Azure is having issues since 10:29 AM EDT. Most reported problems
42% Cloud services
40% Website hosting
17% Virtual machines
10/22/19
–RCA – Service Availability – West Europe Between 21:15 UTC on 22 Oct 2019 and 00:23 UTC on 23 Oct 2019, a subset of customers using Storage in West Europe may have experienced service availability issues. In addition, resources with dependencies on the impacted storage unit may have experienced downstream impact in the form of availability issues or high latency.
–Azure Portal Sign In Failure – West Europe Between 06:40 UTC and 09:54 UTC, a subset of customers were identified as having experienced issues signing in to the Azure Portal in West Europe.
10/21/19
–RCA – Storage – West Europe Between 23:20 UTC on 21 Oct 2019 and 04:32 UTC on 22 Oct 2019, a subset of customers using Storage in West Europe may have experienced service availability issues. In addition, resources with dependencies on the impacted storage unit may have experienced downstream impact in the form of availability issues or high latency.
–RCA – Service Management Errors – East US 2 Between 10:08 and 23:37 UTC, a subset of customers in East US 2 may have received error notifications or experienced high latency when performing service management operations – such as create, update, or delete – for resources hosted in this region. Additionally, customers using Azure Databricks and/or Data Factory v2 may have encountered service management errors in multiple regions. A very limited subset of customers using Virtual Machines with SQL Server images, or other SQL IaaS offerings, may have also encountered errors performing service management operations on resources hosted in East US 2.
10/18/19 RCA – Authentication issues with Azure MFA in North America Between 13:30 UTC and 15:57 UTC, customers in North America experienced issues receiving multi-factor authentication (MFA) challenges. Users who had valid MFA claims during the incident were not impacted. However, users who were required to perform an MFA challenge during this incident were unable to complete the challenge. This represented 0.51% of users in North American tenants using the service during the incident.
DownDetector
10/18/19 Problems at Microsoft Azure Microsoft Azure is having issues since 10:05 AM EDT. Most reported problems
60% Cloud services
35% Website hosting
3% Virtual machines
10/16/19 RCA – Azure Portal Between 13:45 and 14:59 UTC (approx.), a subset of customers may have experienced latency issues with the Azure Portal, Command Line, and Azure PowerShell.
AWS
10/24/19 VMware Cloud on AWS: Host Remediation Degradation VMware Cloud on AWS (VMC) experienced an operational issue that caused us to inadvertently remediate hosts that had not actually failed. Customers may have noticed an unusual amount of host replacement activity due to this error. This incident affected: VMware Cloud on AWS
Start Time (UTC): October 24, 2019, 22:20 hours
End Time (UTC): October 25, 2019, 22:25 hours
Duration: 24hrs 5mins
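The duration above follows directly from the listed start and end times. As a minimal sketch (the function name `outage_duration` and the timestamp format string are assumptions made for illustration, not part of any provider's status feed), the arithmetic can be checked in Python:

```python
from datetime import datetime, timezone


def outage_duration(start: str, end: str) -> str:
    """Return the elapsed time between two UTC timestamps as 'Xhrs Ymins'.

    Assumes both timestamps look like 'October 24, 2019, 22:20' (hypothetical
    format chosen to match the entry above) and are already in UTC.
    """
    fmt = "%B %d, %Y, %H:%M"
    start_dt = datetime.strptime(start, fmt).replace(tzinfo=timezone.utc)
    end_dt = datetime.strptime(end, fmt).replace(tzinfo=timezone.utc)
    # Convert the difference to whole minutes, then split into hours and minutes.
    total_minutes = int((end_dt - start_dt).total_seconds() // 60)
    hours, minutes = divmod(total_minutes, 60)
    return f"{hours}hrs {minutes}mins"


print(outage_duration("October 24, 2019, 22:20", "October 25, 2019, 22:25"))
```

For the times listed in this entry, the sketch prints 24hrs 5mins, matching the reported duration.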
DownDetector
10/23/19 Problems at Amazon Web Services Amazon Web Services is having issues since 8:48 AM EDT. Most reported problems
63% S3
29% AWS Console
6% Route53
DownDetector
10/22/19 Problems at Amazon Web Services Amazon Web Services is having issues since 2:56 PM EDT. Most reported problems
73% S3
18% AWS Console
8% Route53
10/15/19 VMware Cloud on AWS: SDDC Capacity issue in Asia Pacific Southeast Region New SDDC provisioning may fail for Asia Pacific South East (Singapore). This incident affected: VMware Cloud on AWS SDDC (Asia Pacific (Singapore)).
Start Time: October 15, 2019, 08:32 UTC
End Time: October 18, 2019, 07:51 UTC
10/14/19 VMware Cloud on AWS: SDDC provisioning failures in US West (Oregon) Region New SDDC provisioning may fail in US West (Oregon). This incident affected: VMware Cloud on AWS SDDC (US West (Oregon)).
Start Time: October 14, 2019 07:56 UTC
End Time: October 15, 2019 07:08 UTC
10/10/19 VMware Cloud on AWS: SDDC provisioning failures in US East (N. Virginia) region New SDDC provisioning or adding hosts/clusters may fail in US West (Oregon). This incident affected: VMware Cloud on AWS SDDC (US East (N. Virginia), US West (Oregon), Asia Pacific (Sydney)).
Start Time: October 10, 2019 18:00 UTC
End Time: October 10, 2019 15:14 UTC
10/5/19
–VMware Cloud on AWS: Unable to access the Network and Security tab Customers will not be able to view Network and Security tab. This incident affected: VMware Cloud on AWS (VMware Cloud on AWS).
Start Time: October 05, 2019 04:20 UTC
End Time: October 05, 2019 11:20 UTC
–VMware Cloud on AWS Service Performance Degradation issue User may not be able to access the service or may experience trouble while accessing the service. This incident affected: VMware Cloud on AWS (VMware Cloud on AWS, DRaaS, HCX).
Start Time: October 05, 2019 01:00 UTC
End Time: October 05, 2019 01:34 UTC
10/4/19
-Amazon CloudWatch (Oregon) Elevated Error Rates Starting at 5:27 PM PDT, there were delays delivering scheduled CloudWatch events. Recovery began at 7:36 PM PDT, and by 9:22 PM PDT the backlog of all previously scheduled CloudWatch events was successfully delivered.
-Amazon CloudWatch (N. Virginia) Elevated Error Rates Starting at 5:27 PM PDT, there were delays delivering scheduled CloudWatch events. Recovery began at 7:31 PM PDT, and by 9:28 PM PDT the backlog of all previously scheduled CloudWatch events was successfully delivered.
10/2/19 VMware Cloud on AWS: SDDC provisioning failures New SDDC provisioning or adding hosts/clusters may fail in US West (Oregon). This incident affected: VMware Cloud on AWS SDDC (US West (Oregon)).
Start Time: Tuesday, 1 October 2019, 22:28 UTC
End Time: Wednesday, 2 October 2019, 20:56 UTC
Other Platforms
10/31/19
-Google Cloud infrastructure components Incident #19001 Our engineers have determined the issue to be linked to a single Google incident. Incident began at 2019-10-31 16:41 and ended at 2019-11-02 10:00 (US/Pacific).
-Google Cloud Composer Incident #19003 Our engineers have determined that Cloud Composer was impacted by the same underlying issue as the Google Compute Engine (GCE) incident. Incident began at 2019-10-31 16:41 and ended at 2019-11-01 12:41 (US/Pacific).
-Google Cloud Networking Incident #19021 Our engineers have determined that Google Cloud Networking was impacted by the same underlying issue as the Google Compute Engine (GCE) incident. Incident began at 2019-10-31 16:41 and ended at 2019-11-02 10:00 (US/Pacific).
-Google Cloud Machine Learning Incident #19002 Our engineers have determined that Cloud ML Engine was impacted by the same underlying issue as the Google Compute Engine (GCE) incident. Incident began at 2019-10-31 16:41 and ended at 2019-11-01 09:00 (US/Pacific).
-Google Cloud Memorystore Incident #19002 Our engineers have determined that Cloud Memorystore was impacted by the same underlying issue as the Google Compute Engine (GCE) incident. Incident began at 2019-10-31 16:41 and ended at 2019-11-01 07:41 (US/Pacific).
-Google Cloud DNS Incident #19003 Our engineers have determined that Google Cloud DNS was impacted by the same underlying issue as the Google Compute Engine (GCE) incident. Incident began at 2019-10-31 16:41 and ended at 2019-11-01 08:15 (US/Pacific).
-Google Cloud Datastore Incident #19005 We’ve received a report of an issue with Cloud Datastore / Firestore in us-central. Incident began at 15:30 and ended at 16:01 (US/Pacific).
10/27/19 VMware Cloud Services Functionality Issues Intermittent issues with functionality of our VMware Cloud Services.
Start Time: October 27, 2019 21:48 UTC
End Time: October 28, 2019 01:50 UTC
10/24/19 VMware Cloud on AWS: Host Remediation Degradation VMware Cloud on AWS (VMC) experienced an operational issue that caused us to inadvertently remediate hosts that had not actually failed. Customers may have noticed an unusual amount of host replacement activity due to this error.
Start Time (UTC): October 24, 2019, 22:20 hours
End Time (UTC): October 25, 2019, 22:25 hours
Duration: 24hrs 5mins
10/22/19
-SAP Cloud Platform Europe (Netherlands) [cf-eu20] – Service Advisory Since approximately 21:13 UTC on 22 Oct 2019 until 03:04 UTC on 23 Oct 2019 Lifecycle management operations cannot be executed. Application logs in Kibana are unavailable.
-SAP Cloud Platform Europe (Netherlands) [cf-eu20] – Service Advisory Since approximately 23:20 UTC on 21 Oct 2019 until 04:42 UTC on 22 Oct 2019 Data processing for IoT message producers and consumers is not possible
-Google Cloud Bigtable Incident #19001 Our engineers have determined this issue to be linked to the Google Cloud Networking incident in us-west1-b. Incident began at 17:04 and ended at 20:04 (US/Pacific).
-Google Cloud Networking Incident #19020 Google Compute Engine experienced 100% packet loss to and from ~20% of instances in us-west1-b for a duration of 2 hours, 31 minutes. Additionally, 20% of Cloud Routers, and 6% of Cloud VPN gateways experienced equivalent packet loss in us-west1. Incident began at 16:47 and ended at 21:35 (US/Pacific).
-Cloud Memorystore Incident #19001 Our engineers have determined this issue to be linked to the Google Cloud Networking incident in us-west1-b. Incident began at 17:14 and ended at 19:05 (US/Pacific).
-Google Cloud Storage Incident #19008 Incident began at 17:11 and ended at 20:04 (US/Pacific).
10/20/19 VMware Cloud PKS Performance Degradation issue Customers may experience delays or failures in creating or deleting clusters in all regions.
Start Time: October 20, 2019 17:05 UTC
End Time: October 20, 2019 18:33 UTC
10/19/19 SAP Cloud Platform US East (Ashburn) [neo-us1] – Service Advisory Since approximately 07:32 UTC until 08:11 UTC Git Service and HTML5 application deployment are unavailable
10/18/19
-SAP Cloud Platform US East (Ashburn) [neo-us1] – Service Advisory Since approximately 22:03 UTC until 23:02 UTC Forms by Adobe is unavailable
-SAP Cloud Platform US East (Ashburn) [neo-us1] – Service Advisory Since approximately 14:51 UTC until 18:31 UTC A general disruption is impacting the availability of applications and services
10/15/19
–VMware Cloud on AWS: SDDC Capacity issue in Asia Pacific Southeast Region New SDDC provisioning may fail for Asia Pacific South East (Singapore).
Start Time: October 15, 2019, 08:32 UTC
End Time: October 18, 2019, 07:51 UTC
–VMware Skyline Service Availability issue Users may not be able to access the service or may experience trouble while accessing the service.
Start Time: October 15, 2019, 03:48 UTC
End Time: October 15, 2019, 06:54 UTC
10/14/19
–VMware Cloud on AWS: SDDC provisioning failures in US West (Oregon) Region New SDDC provisioning may fail in US West (Oregon).
Start Time: October 14, 2019 07:56 UTC
End Time: October 15, 2019 07:08 UTC
-SAP Cloud Platform Japan (Tokyo) [neo-jp1] – Service Advisory Since approximately 02:01 UTC until 03:12 UTC Lifecycle operations for HTML5 applications are not possible. Portal and WebIDE services were unavailable.
10/12/19 VMware Cloud Services Performance Degradation with VMware Log Intelligence Customers can expect to see log ingestion delays ranging from 30 to 90 minutes.
Start Time: October 11, 2019 21:38 UTC
End Time: October 12, 2019 02:15 UTC
10/11/19
–SAP Cloud Platform US East (Sterling) [neo-us3] – Service Advisory Since approximately 11:50 UTC until 13:38 UTC A general disruption is impacting the availability of applications and services
–SAP Cloud Platform Europe (Frankfurt) [cf-eu10] – Service Advisory Since approximately 05:18 UTC until 10:45 UTC Lifecycle management operations for Java applications cannot be executed. Customers may not be able to deploy, undeploy their application.
10/10/19
–VMware Cloud on AWS: SDDC provisioning failures in US East (N. Virginia) region New SDDC provisioning or adding hosts/clusters may fail in US West (Oregon).
Start Time: October 10, 2019 18:00 UTC
End Time: October 10, 2019 15:14 UTC
-SAP Cloud Platform Europe (Frankfurt) [neo-eu2] – Service Advisory Since approximately 07:30 UTC until 08:32 UTC Lifecycle management operations for Java applications cannot be executed. Customers may not be able to deploy, undeploy their application.
-SAP Cloud Platform Europe (Frankfurt) [neo-eu2] – Service Advisory Since approximately 05:38 UTC until 06:00 UTC Customers may not be able to deploy, undeploy their application. Lifecycle management operations for Java applications cannot be executed
-SAP Cloud Platform Singapore [cf-ap11] – Service Advisory Since approximately 00:39 UTC until 04:29 UTC Applications protected by Authorization & Trust Management (XSUAA) are not accessible
10/8/19 SAP Cloud Platform Japan (Tokyo) [neo-jp1] – Service Advisory Since approximately 13:24 UTC until 13:47 UTC A general disruption is impacting the availability of applications and services
10/7/19
–VMware Cloud PKS Performance Degradation issue Users might experience issues while creating cluster in US-East Region (North Virginia).
Start Time: October 07, 2019 04:14 UTC
End Time: October 07, 2019 05:06 UTC
-SAP Cloud Platform Brazil (São Paulo) [neo-br1] – Service Advisory Since approximately 12:03 UTC until 13:11 UTC Audit Log retrieval API is not available
10/5/19
–VMware Skyline Service Intermittent Availability issue User may not be able to access the service or may experience trouble while accessing the service.
Start Time: October 05, 2019 14:53 hours UTC
End Time: October 05, 2019 16:30 hours UTC
–VMware Cloud on AWS: Unable to access the Network and Security tab Customers will not be able to view Network and Security tab.
Start Time: October 05, 2019 04:20 UTC
End Time: October 05, 2019 11:20 UTC
–VMware Cloud on AWS Service Performance Degradation issue User may not be able to access the service or may experience trouble while accessing the service.
Start Time: October 05, 2019 01:00 UTC
End Time: October 05, 2019 01:34 UTC
10/4/19 VMware Skyline Service Intermittent Availability issue Some users may not be able to perform any operations on Skyline Advisor Dashboard.
Start Time: October 04, 2019 12:36 UTC
End Time: October 04, 2019 13:17 UTC
10/2/19 VMware Cloud on AWS: SDDC provisioning failures New SDDC provisioning or adding hosts/clusters may fail in US West (Oregon). This incident affected: VMware Cloud on AWS SDDC (US West (Oregon)).
Start Time: Tuesday, 1 October 2019, 22:28 UTC
End Time: Wednesday, 2 October 2019, 20:56 UTC