At about 14:00 PST (-08:00) our engineers noticed that the monitoring daemon on our master (gw00) load balancer that checks whether slices are alive stopped functioning properly. It appeared to be up but was not detecting slice failures properly. We failed over to our slave (gw01) load balancer at 14:05 PST (-08:00) so we could diagnose the problem with the master.

After the failover a misconfiguration on gw01 prevented internal DNS queries from working properly but the exact issue was not immediately apparent. We noticed sites becoming unavailable shortly after and our engineers kept reviewing the issue. We rebooted gw00 and failed over gw01 to gw00 to get sites back online. This happened at 14:35 PST (-08:00). We received the OK for all sites in the next couple of minutes.

Our investigation is still ongoing. If you are experiencing any issues please let us know via phone, ticket or IRC. Again, we apologise for the downtime and inconvenience caused.