
Bryan Oliver
He / him / his
Principal @Thoughtworks, Global Speaker, Co-Author of "Effective Platform Engineering" and "Designing Cloud Native Delivery Systems"
Bryan is an engineer who designs and builds complex distributed systems. For the last 3 years, he has focused on platforms, GPU infrastructure, and cloud native at Thoughtworks. Through this work, he is invited to speak at conferences all over the globe. He is also a published author, with "Effective Platform Engineering" from Manning and "Designing Cloud Native Delivery Systems", an early access book from O'Reilly.
Session
Chaos Engineering GPU Clusters
We are used to the concepts of fault injection and chaos engineering in conventional clusters and web API services. Techniques like node shutdowns, CPU exhaustion, and memory leaks are all easy to automate in Kubernetes with open source or proprietary tools.