As an SSD Reliability Validation Engineer, you will design and execute reliability and stress test plans for enterprise SSDs, with a primary focus on:
RDT campaigns to demonstrate lifetime reliability and field-equivalent stress.
4-corner validation across temperature / voltage / workload / media stress.
Customer-mode validation, including customer-specific feature modes, power/performance limits, and telemetry/OCP profiles.
You will work closely with NAND, firmware, hardware, systems, and manufacturing teams to define coverage, execute tests in automated environments, analyze results, and communicate clear recommendations on SSD readiness and risk.
Responsibilities
Own end-to-end RDT and reliability demonstration plans for new SSDs, including workloads, sample plans, stress levels, and pass/fail criteria.
Plan and execute 4-corner and stress validation across voltage, temperature, workload, and media/background operations.
Translate customer requirements and modes into concrete reliability and stress tests, and provide clear readiness/risk readouts.
Develop and automate test content, harnesses, and CI/regression integration in Python/Linux-based environments.
Analyze logs and telemetry to debug issues, drive Jira issues to closure, and partner with NAND/FW/HW/systems/analytics teams on fixes and dashboards.
Apply reliability statistics (e.g., Weibull analysis) and acceleration models to design sample plans, estimate MTBF, and define data-driven qualification thresholds.
Ensure RDT methodologies and workload profiles mirror industry-standard (e.g., JEDEC) benchmarks and real-world customer deployment scenarios.
Requirements
5+ years of experience in SSD, storage, or hardware reliability / validation, ideally with enterprise or hyperscale products.
Strong understanding of NAND flash fundamentals (P/E cycling, wear-out, read-disturb, retention, ECC, PLP/holdup) and how they map into reliability tests.
Hands-on experience with RDT / ALT / ORT or equivalent reliability demonstration programs for SSDs or similar embedded products.
Proficient in reliability statistics and acceleration modeling for lifetime projections, with practical experience using data science and statistical libraries.
Proficient in JEDEC specifications, OCP datacenter storage profiles, and NVMe architectural requirements, with expert-level knowledge of industry-standard qualification and compliance frameworks.
Practical experience designing and executing 4-corner or environmental stress tests (voltage, temperature, workload corners).
Familiarity with NVMe / PCIe concepts (basic command flows, admin vs I/O path, error reporting, SMART/Telemetry, OCP extensions).
Strong Python (or similar) skills for test and tooling development; comfortable working in Linux-based lab environments.
Experience using test automation and CI (e.g., Jenkins, lab frameworks) to run large-scale, long-running test campaigns.
Solid data analysis skills to interpret error logs, telemetry, and large test datasets; able to turn data into clear engineering conclusions.
#LI-ONSITE
Salary ranges are determined based on role, level and location. For positions open to candidates in multiple geographical locations, the base salary range is reflective of the labor market across the applicable locations.
This role may be eligible for incentive pay and/or equity.
There is no application deadline and we accept applications on an ongoing basis until the job is filled.
Join Pure Storage and make an impact in renewable energy