GitHub Availability Report: October 2021

Post Syndicated from Scott Sanders original https://github.blog/2021-11-04-github-availability-report-october-2021/

In October, we experienced one incident resulting in significant impact and degraded state of availability for the GitHub Codespaces service.

October 8 17:16 UTC (lasting 1 hour and 36 minutes)

A core Codespaces API response was inadvertently restructured as part of our Codespaces public API launch, impacting existing API clients dependent on a stable schema.

For the duration of the incident, new Codespaces could not be initiated from the Visual Studio Code Desktop client. Connections to the web editor and pre-existing desktop sessions were not impacted, but degraded, with the extension displaying an error message while omitting Codespaces metadata from the Remote Explorer view.

The incident was mitigated once we rolled back the regression, at which point all clients could connect again, including with new Codespaces created during the incident. As our monitoring systems did not initially detect the impact of the regression, a subsequent and unrelated deployment was initiated, delaying our ability to revert the change. To ensure similar breaking changes are not introduced in the future, we are investing in tooling to support more rigorous end-to-end testing with the extension’s use of our API. Additionally, we are expanding our monitoring to better align with the user experience across the relevant internal service boundaries.

In summary

We will continue to keep you updated on the progress and investments we’re making to ensure the reliability of our services. To learn more about what we’re working on, check out the GitHub engineering blog.