[Hpcresilience] Netflix uncages Chaos Monkey disaster testing system

Debardeleben, Nathan A ndebard at lanl.gov
Tue Jul 31 10:28:59 MDT 2012


Resilience folks will likely find this interesting.  In talks I have seen, Netflix described an OS that randomly kills developer tasks as a way of forcing their developers to think in terms of unreliable systems.  Apparently this is it, and it is now available for use.

-- Nathan

  Nathan DeBardeleben, Ph.D.
  Los Alamos National Laboratory
  High Perf. Computing Systems Integration (HPC-5)
  Ultra-Scale Research Center, Resilience Lead
  phone: 505-412-1069
  email: ndebard at lanl.gov