A Site Reliability Engineer at Canonical deploys, manages, and automates enterprise infrastructure built on open source technologies such as OpenStack and Kubernetes. The role ensures high availability and performance for mission-critical cloud services and sits within a remote, globally distributed team.
Key Responsibilities
Deploy and operate OpenStack, Kubernetes, storage solutions, and open source applications using DevOps practices
Monitor mission-critical services, and identify and resolve incidents
Automate operations through software engineering, focusing on metrics and code-driven solutions
Manage and maintain infrastructure across physical, private, and public cloud environments
Work with full open source infrastructure stack from bare metal to containers
Collaborate with teams to refine product performance and reliability
Requirements
Degree in software engineering or computer science
Python software development experience
Operational experience in Linux environments
Experience with Kubernetes deployment or operations
Ability to operate mission-critical services for global brand-name customers
Genuine interest in the full open source infrastructure stack from bare metal to containers
Ability to work in a distributed, remote environment
Willingness and ability to travel internationally twice a year for company events, for up to two weeks each time
Benefits & Perks
Compensation is adjusted every 6 months based on performance, with additional annual bonuses
Fully remote, distributed work environment
Personal learning and development budget of USD 2,000 per year
Recognition rewards
Annual holiday leave
Maternity and paternity leave
Employee Assistance Programs
Opportunity to travel to new locations to meet colleagues
Priority Pass and travel upgrades for long-haul company events
Ready to Apply?
Join Canonical and make an impact in open source infrastructure