• A PhD in Computer Science, Machine Learning, Natural Language Processing, Computer Vision, or a closely related AI field;
• Hands-on experience training or fine-tuning large language models or large multimodal models (e.g., vision-language models, speech-language models), covering pre-training or post-training techniques such as SFT and RLHF;
• Experience training or fine-tuning models that operate across multiple modalities (e.g., video + language, image + text, speech + text);
• A strong publication track record in peer-reviewed AI conferences or journals;
• Proficiency in Python and deep experience with modern ML frameworks (e.g., PyTorch, JAX);
• Demonstrated ability to design rigorous experiments and interpret their results.