Microsoft reports the cause of the “Azure AD” outage

Microsoft has released a preliminary root cause analysis of the outage that hit “Azure Active Directory” (Azure AD) on March 15. The failure prevented users from signing in to Microsoft and third-party applications that rely on Azure AD for authentication, such as Office, Teams, Dynamics 365, and Xbox Live. The company says the outage lasted about 14 hours and affected “some” Microsoft customers around the world.

According to the preliminary analysis of the incident, published on March 16 on Microsoft's “Azure Status History” page, the failure was caused by a problem with the rotation of the keys that Azure AD uses for cryptographic signing operations under OpenID and other identity standard protocols.

Azure AD automatically removes obsolete keys as part of its normal security work, but for the past few weeks one key had been marked as “held” to support a complex inter-cloud migration. According to Microsoft, this exposed a specific bug that caused the held key to be removed anyway.

Azure AD publishes metadata about its signing keys globally. When this metadata changed at 3:00 pm Eastern time, the moment the failure began, applications using Azure AD began receiving the new metadata and no longer trusted tokens and assertions that had been signed with the deleted key.

Microsoft engineers rolled back the system around 5 pm Eastern time, but it takes time for applications to receive the rolled-back metadata and start using the correct keys again. Some storage resources remained affected for longer and required an update to invalidate the incorrect entries and force a refresh.

The report states that Azure AD is in the middle of a multi-phase effort to strengthen the protections of its back-end “Safe Deployment Process” to prevent this type of problem. The key removal component belongs to the second phase of that work, which is expected to be completed in mid-2021.

In the report, Microsoft says: “We understand that this incident was extremely impactful and unacceptable, and we deeply apologize. We will continue to take measures to improve the Microsoft Azure platform and our processes to prevent this from happening in the future.” The final analysis will be published once the investigation is complete.
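
The failure mode described in the report, where applications refresh the published signing-key metadata and then reject tokens signed with a key that is no longer listed, can be illustrated with a minimal Python sketch. This is not Microsoft's implementation: the names `provider_keys`, `sign_token`, `publish_metadata`, and `RelyingParty` are hypothetical, and HMAC stands in for the asymmetric signatures that real identity providers use, purely to keep the example within the standard library.

```python
import hashlib
import hmac

# Hypothetical key store standing in for the identity provider's signing keys.
# Assumption: HMAC secrets replace real asymmetric key pairs for simplicity.
provider_keys = {"key-old": b"secret-old", "key-new": b"secret-new"}


def sign_token(payload: str, kid: str) -> dict:
    """Issue a token tagged with the ID of the key that signed it."""
    sig = hmac.new(provider_keys[kid], payload.encode(), hashlib.sha256).hexdigest()
    return {"kid": kid, "payload": payload, "sig": sig}


def publish_metadata(active_kids: list) -> dict:
    """Return the published key metadata: key ID mapped to verification key."""
    return {kid: provider_keys[kid] for kid in active_kids}


class RelyingParty:
    """An application that trusts only the keys listed in the latest metadata."""

    def __init__(self, metadata: dict):
        self.trusted = dict(metadata)

    def refresh(self, metadata: dict) -> None:
        # Replace the cached key set with whatever is currently published.
        self.trusted = dict(metadata)

    def accepts(self, token: dict) -> bool:
        key = self.trusted.get(token["kid"])
        if key is None:
            # The signing key is no longer published, so the token is rejected.
            return False
        expected = hmac.new(key, token["payload"].encode(), hashlib.sha256).hexdigest()
        return hmac.compare_digest(expected, token["sig"])


# A token issued while the old key was still in use.
token = sign_token("user=alice", "key-old")

app = RelyingParty(publish_metadata(["key-old", "key-new"]))
before = app.accepts(token)

# The old key is removed from the published metadata; the app refreshes.
app.refresh(publish_metadata(["key-new"]))
after_removal = app.accepts(token)

# The provider rolls back, but the app recovers only once it refreshes again.
app.refresh(publish_metadata(["key-old", "key-new"]))
after_rollback = app.accepts(token)

print(before, after_removal, after_rollback)
```

In this sketch the token is accepted, then rejected once the key disappears from the metadata, and accepted again only after the application picks up the rolled-back metadata, which mirrors why recovery lagged behind the rollback.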