================
Incident reports
================

Blameless incident reports are essential to the long-term sustainability of
resilient infrastructure. We publish them here for transparency, and so that
we can learn from them when handling future incidents.

.. toctree::
   :maxdepth: 1

   2017-02-09-datahub-db-outage
   2017-02-24-autoscaler-incident
   2017-02-24-proxy-death-incident
   2017-03-06-helm-config-image-mismatch
   2017-03-20-too-many-volumes
   2017-03-23-kernel-deaths-incident
   2017-04-03-cluster-full-incident
   2017-05-09-gce-billing
   2017-10-10-hung-nodes
   2017-10-19-course-subscription-canceled
   2018-01-25-helm-chart-upgrade
   2018-01-26-hub-slow-startup
   2018-02-06-hub-db-dir
   2018-02-28-hung-node
   2018-06-11-course-subscription-canceled
   2019-02-25-k8s-api-server-down
   2019-05-01-service-account-leak