The role involves developing and validating high-performance storage solutions: designing automation frameworks, stress-testing firmware and hardware, analyzing telemetry data, and collaborating across teams to ensure enterprise-grade reliability of NAND flash storage devices.
Key Responsibilities
Design and scale automated qualification frameworks using Python and C to test drive firmware stability.
Develop automated workflows to stress-test NAND firmware features such as media management, wear leveling, and ECC.
Lead root cause analysis of complex firmware and hardware failures using logs and telemetry data.
Build and maintain data pipelines to collect system logs and performance metrics for validation and sign-off.
Continuously improve qualification methodologies and automation to enhance test coverage and efficiency.
Requirements
Deep understanding of NAND flash fundamentals, including program/erase behavior, error mechanisms, and data retention, and of SSD firmware architecture such as FTL and garbage collection.
Advanced development skills in Python and C, with a proven ability to build robust automation frameworks, backend systems, or platform infrastructure.
Hands-on experience with NVMe tooling and storage protocols, and a track record of qualifying storage devices such as SSDs, HDDs, and DFM through early silicon bring-up and stress testing.
Exceptional ability to analyze complex traces and telemetry to root-cause failures arising from firmware and hardware interactions, and strong communication skills to drive cross-functional solutions.
Experience designing and automating qualification methods that increase test coverage and quality indicators within a CI/CD or large-scale hardware/firmware environment.
Minimum of 3 years of experience in storage systems, firmware validation, or related fields.
Proficiency in developing and scaling Python and C test infrastructures that orchestrate drive-level qualification, ensuring firmware stability across performance, endurance, and reliability workstreams.
Ability to develop automated workflows to stress-test critical features including media management, wear leveling, and ECC behavior, and translate raw validation data into product-readiness dashboards.
Experience leading technical triage of complex failures across NAND and firmware layers, utilizing logs and telemetry to isolate issues and partnering with architecture teams to implement permanent fixes.
Experience building and maintaining high-integrity data collection pipelines that capture system logs and performance metrics for system validation and final GA sign-off.
Ability to continuously improve qualification methodologies and coverage metrics to keep pace with evolving controller architectures and NAND generations, reducing manual intervention through automation.
In-depth knowledge of SSD firmware architecture, including FTL and garbage collection, and understanding of NAND flash fundamentals such as program/erase behavior, error mechanisms, and data retention.
Proficiency in protocol validation with NVMe tooling and storage protocols, including experience qualifying storage devices through silicon bring-up and stress testing.
Physical ability and willingness to work on-site at the Santa Clara, CA office in compliance with company policies.
Benefits & Perks
Salary range: 180,000 - 270,000 USD annually
Work environment: primarily in-office at Santa Clara, CA
Work schedule: flexible time off
Perks: wellness resources, company-sponsored team events
Additional benefits: incentive pay and/or equity, support for growth and development, inclusive and diverse workplace culture
Ready to Apply?
Join Pure Storage and make an impact in enterprise flash storage