Cloudflare blames massive internet outage on 'latent bug'
(techcrunch.com)
Not really. Sometimes there are processes designed where engineers will make a change as a reaction or in preparation for something. They could have easily made a mistake when making a change like that.
E.g., companies that advertise during a large sporting event might preemptively scale up (or warm up, depending on the language) their servers ahead of the load spike that follows an ad or a mention of a coupon or promo code. Failing to capture the traffic that ad generates would be seen as wasted $$$.
Edit: auto-scaling isn't enough for non-essential products; people won't come back if the website fails to load on the first attempt.
I don't think the bug was in making the configuration change; I think a bug surfaced as a result of that change.
That specific combination of changes may not have been tested or applied in production for months, and today just happened to be the first time it was needed since an update some time ago, hence the "latent" part.
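To illustrate what "latent" means here, a made-up sketch (not Cloudflare's actual code; `MAX_RULES` and `load_rules` are hypothetical names): a failure path can sit untouched for months because routine configs never reach it, then fire the first time a config grows past an old limit.

```python
MAX_RULES = 4  # hypothetical hard limit, chosen long ago and never revisited


def load_rules(config_lines):
    """Parse non-empty lines as rules; the limit check is the dormant path."""
    rules = [line.strip() for line in config_lines if line.strip()]
    if len(rules) > MAX_RULES:
        # Never exercised while configs stayed small, so nobody noticed
        # that this fails hard instead of degrading gracefully.
        raise RuntimeError(f"too many rules: {len(rules)} > {MAX_RULES}")
    return rules


# Routine change: stays under the limit, works as it has for months.
print(load_rules(["a", "b", "c"]))

# Equally routine-looking change that happens to cross the limit.
try:
    load_rules(["a", "b", "c", "d", "e"])
except RuntimeError as err:
    print("crash:", err)
```

Both calls look like the same kind of everyday config change; only the second one reaches the code path that was never tested.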
But they do changes like that routinely.
Yeah, I just read the postmortem. My response was more about the misconception that any configuration change is inherently non-routine.