A Machine Learning Engineer role focused on developing AI-driven GPU kernel optimizations to improve performance and reduce inference latency for Nuro's self-driving technology.
Key Responsibilities
Implement AI-driven GPU kernel optimization methods to improve program efficiency and performance.
Develop strategies for resource-efficient deployment of AI-based optimization processes.
Guide high-level optimizations, including neural architecture search, to improve training efficiency and reduce inference latency.
Assess performance improvements using evaluation metrics and real-world feedback.
Collaborate with internal teams to benchmark optimization processes and strategies.
Utilize leaderboard scoring systems to evaluate AI models' ability to generate efficient GPU kernels.
Requirements
Bachelor's or Master's Degree in Computer Science, Engineering, or a related field.
Extensive experience with AI models, including but not limited to Large Language Models (LLMs).
Strong understanding of retrieval-augmented generation (RAG) and LLMs.
Self-motivated, with the ability to undertake complex assignments independently and balance innovative exploration with practical considerations.
Benefits & Perks
Compensation range of $193,930 to $352,290, depending on experience and qualifications