The role involves designing and implementing global infrastructure solutions, automating operations, and ensuring reliability and security for Canonical's internal cloud and open source platforms, primarily supporting Ubuntu and related technologies.
Key Responsibilities
Define and implement a holistic vision for world-class internal cloud infrastructure
Develop and maintain technical design roadmaps and guidelines to improve reliability, resilience, and scalability
Collaborate with cloud-ops software development teams on roadmap, requirements, and automation initiatives
Advise IS management on technology choices, reliability, resilience, and business cases
Lead technical decisions to develop scalable, self-service solutions
Work with security teams to establish best practices and mitigate threats
Automate operations for large-scale distributed systems
Design service architecture, documentation, and operational procedures in collaboration with development teams
Analyze incidents to identify root causes and implement structural improvements
Requirements
Exceptional academic track record from both high school and university
Undergraduate degree in a technical subject or a compelling narrative about your alternative chosen path
Confidence to respectfully speak up, exchange feedback, and share ideas without hesitation
Track record of going above and beyond expectations to achieve outstanding results
Extensive knowledge of cloud computing concepts, technologies, and operations
Practical knowledge of Linux networking, routing, firewalls, internet transit, and large-scale, high-bandwidth networking
Experience dealing with significant production outages, incident response, and postmortems
A passion for writing, sharing, and maintaining enterprise open-source software solutions
Ability to communicate clearly and effectively in English over email, chat, video or voice calls, and in person
Familiarity with and passion for open source, especially Ubuntu or Debian
Willingness to travel twice annually for company events, totaling around 4 weeks per year
Ability to define, get buy-in for, and implement a holistic vision of a world-class internal cloud
Ability to set up, maintain, and update the technical design roadmap and guidelines for SREs within IS to improve reliability, resilience, operational scalability, and technical scalability
Ability to collaborate with cloud-ops software development teams to provide input for roadmap, requirements, and prioritization
Ability to provide the IS management with input and advice regarding technology, reliability, resilience, and business cases
Ability to lead technical choices that deliver solutions as self-service products and ensure scalable operation
Ability to collaborate with product security and operations security teams to set best practices and mitigate threats in a timely manner
Skill in automating operations in a reusable way that accounts for the complexities of distributed systems at large companies
Ability to collaborate with development teams to design service architecture, documentation, playbooks, policies, and operational procedures
Skill in analyzing incidents and events to identify root causes and structural improvements that minimize recurrence
Benefits & Perks
Competitive compensation with annual review and performance-driven bonus or commission
Home-based position with twice-annual travel to company events totaling around 4 weeks per year
Distributed work environment with in-person team sprints
Personal learning and development budget of USD 2,000 per year
Annual holiday leave
Maternity and paternity leave
Employee Assistance Program
Opportunity to travel to new locations to meet colleagues
Priority Pass and travel upgrades for long-haul company events
Ready to Apply?
Join Canonical and make an impact on the internal cloud and open source platforms that support Ubuntu