The Senior AI Infrastructure Engineer is responsible for designing, optimizing, and scaling the infrastructure that supports real-time AI models and applications in healthcare, including large language models, computer vision, and voice AI, so that the AI Care platform stays performant and scalable.
Key Responsibilities
Own and optimize infrastructure for deploying AI models in production
Design systems for real-time computer vision, large language model serving, and low-latency voice interactions
Scale GPU clusters to support millions of AI sessions
Optimize inference performance for large language models and real-time video processing
Manage GPU workload orchestration so that models run at the required speed and scale
Requirements
Own the infrastructure that brings AI models to life in production, including optimizing LLM inference, deploying real-time voice AI agents, and scaling GPU clusters that serve millions of sessions.
Design systems that power real-time computer vision for movement analysis, serve large language models for conversational AI, and enable low-latency voice interactions for AI agents.
Be deeply embedded in AI-specific challenges such as inference optimization, real-time video processing, model serving at scale, and GPU workload orchestration.
Be passionate about pushing the boundaries of AI infrastructure performance.
Have the legal right to work in the United States (for US applicants), or hold a valid EU visa and be based in Portugal (for Portugal-based candidates).