Figure 1a shows how incidents happen substantially less often on Saturday and Sunday even though traffic to the site remains consistent throughout the week. Figure 1b shows a six-month period during which there were only two weeks with no incidents: the week of Christmas and the week when employees are expected to write peer reviews for each other.
These two data points seem to suggest that when Facebook employees are not actively making changes to infrastructure because they are busy with other things (weekends, holidays, or even performance reviews), the site experiences higher levels of reliability.
(seen here)
