A leadership role responsible for the reliability, scalability, and operational excellence of Pure Storage Cloud. The role manages SRE and platform teams, develops automation and cloud infrastructure, and ensures high availability and performance of cloud services.
Key Responsibilities
Lead and develop SRE and Platform teams, setting strategy for reliability, scalability, and operability of the cloud platform
Own reliability engineering by defining and evolving SLIs, SLOs, error budgets, incident response, change management, and runbooks
Build and operate internal platform tooling for modern developer workflows, including CI/CD, observability, telemetry, and automation
Operate and harden core cloud infrastructure such as Kubernetes and IaC across control and data planes
Lead capacity planning, cost optimization, disaster recovery, and multi-region readiness
Requirements
Proven leadership experience running SRE, Production Engineering, and Platform functions for SaaS or cloud services at scale, and building high-performing, inclusive teams.
Hands-on software development experience with fluency in engineering fundamentals, including design reviews, automated testing, CI/CD, and version control, with the ability to contribute to production-grade code.
Deep understanding of SRE foundations, including defining and evolving SLIs, SLOs, and error budgets, as well as incident management, capacity planning, change and release management, and reliability reviews.
Practical cloud expertise, with a preference for Azure, including experience with modern SRE toolchain components such as containers, Kubernetes, Infrastructure as Code (Terraform, Bicep, CloudFormation), CI/CD pipelines, and observability tools like OpenTelemetry, Prometheus, Grafana, ELK, and Azure Monitor.
Strong systems thinking and architectural acumen, including resilience reviews, failure mode analysis, chaos engineering, disaster recovery testing, and data-driven stakeholder communication.
Experience operating and hardening core cloud infrastructure services, including Kubernetes and Infrastructure as Code (IaC) across control and data planes.
Ability to lead capacity planning, cost optimization, disaster recovery, and multi-region readiness for cloud services.
This is an in-office role based in Prague, Czech Republic. You are expected to work from the Prague office except when on PTO, work travel, or other approved leave.
Benefits & Perks
Relocation package
Flexible time off
Wellness resources
Company-sponsored team events
Ready to Apply?
Join Pure Storage and make an impact on the reliability of Pure Storage Cloud