Google's SRE Book on Eliminating Toil

For all the things Google does that I have strong opinions about, some of their SRE training and information is pure gold. I just finished reading the chapter on eliminating toil, and it really hits home for me.

If a human operator needs to touch your system during normal operations, you have a bug. The definition of normal changes as your systems grow.

  • Carla Geisser, Google SRE

So, the more time you spend keeping things running, the less time you have to make them better - for you and your end users.

I won’t copy out the rest of the article; I’d highly recommend reading it and the rest of the SRE book. (I really need to take my own advice.)

