Job Description
The Linux HPC Systems Engineer manages and maintains high-performance computing clusters and ensures system security, performance, and availability, primarily in classified environments that require an active security clearance.
Key Responsibilities
- Research, select, plan, and implement new technologies to support application scaling
- Install, configure, and maintain hardware, software, and system components across multiple platforms
- Administer and manage Linux systems, servers, workstations, and cluster computing environments
- Troubleshoot and resolve complex hardware, software, and network failures
- Perform system and user account maintenance and optimize system performance
- Apply security patches, harden systems, and ensure compliance with standards and policies
- Manage resource allocation and job scheduling using Slurm and other cluster management software
- Oversee day-to-day administration of Linux-based cluster computing, storage, and network infrastructure
Requirements
- An active Secret U.S. Security Clearance.
- Willingness to obtain and maintain a Top Secret SCI clearance with polygraph.
- Minimum of 7 years of experience in a related role such as Linux Systems Administration or High Performance Computing (HPC).
- High School diploma or GED with 7 years of experience in a System Administration role or an equivalent combination of technical education and experience.
- Experience in Linux systems administration, including installing, configuring, and maintaining Linux servers and workstations.
- Demonstrated working knowledge of High Performance Computing (HPC) cluster architectures and concepts.
- Strong understanding of cluster management software such as HPE Performance Cluster Manager, Warewulf, Bright Cluster Manager, Aspen Cluster Management Environment, or OneSIS.
- Familiarity with job scheduling software, specifically Slurm.
- Experience with multiplatform environments including Linux and Windows.
- Experience with scripting for task automation using shell scripting, Python, or Perl.
- Experience with systems management and automation software such as Satellite, Foreman, or Ansible.
- Experience with high performance computing networking protocols such as InfiniBand.
- Experience with Red Hat Enterprise Linux 8 and 9, and Ubuntu Linux 20.04 and 22.04.
- Experience with system hardening practices such as DISA STIG and SCAP.
- Experience integrating Linux systems into an Active Directory environment.
- Experience with network fundamentals including switching, routing, firewalls, and load balancing.
- Experience with system monitoring and alerting tools such as Nagios.
- Ability to work independently or within a development team, effectively estimate time and effort, prioritize tasks, prepare project plans, and complete tasks and projects in a timely manner.
- Physical ability to occasionally stand, climb, stoop, kneel, crouch, or crawl, and regularly lift or move up to 50 pounds.
- Willingness to work on-site 100% of the time, including working evenings and weekends, sometimes with little to no advance notice.
- Willingness to travel up to 10% of the time.
- Possess or be willing to obtain a DoD 8570.01-M IAT Level II or higher certification such as CompTIA Security+, CySA+, GICSP, GSEC, or CISSP within 6 months of hire.
Benefits & Perks
Salary range: 120,715 - 150,895 USD
Work on-site in Malibu, CA
Work schedule may include evenings, weekends, and occasional short-notice shifts
Up to 10% travel required
Bonus benefits (unspecified)
Ready to Apply?
Join HRL and make an impact in renewable energy