Senior Staff Machine Learning Engineer, Data & Eval
- Job Description:
- In this Senior Staff role, you will set technical direction and lead execution for ML evaluation and the end-to-end data flywheel powering CSxAI products (e.g., assistive agents, issue resolution, and tooling).
- Your work will define how we measure quality, how we turn feedback into learning signals, and how we continuously improve models and products safely and efficiently.
- You will partner closely with product, engineering, design, and operations to build evaluation systems that are trusted, scalable, and actionable, connecting offline metrics to online outcomes.
- Work with large-scale structured and unstructured data; explore, experiment, build, and continuously improve Machine Learning models and pipelines for Airbnb product, business, and operational use cases.
- Work collaboratively with cross-functional partners, including product managers, operations, and data scientists, to identify opportunities for business impact; understand, refine, and prioritize requirements for machine learning; and drive engineering decisions.
- Develop, productionize, and operate Machine Learning models and pipelines hands-on and at scale, including both batch and real-time use cases.
- Leverage third-party and in-house Machine Learning tools and infrastructure to develop reusable, highly differentiating, and high-performing Machine Learning systems that enable fast model development, low-latency serving, and easy upkeep of model quality.
- Requirements:
- Educational Background: PhD in Computer Science, Mathematics, Statistics, or related technical field (or equivalent practical experience).
- Industry Experience: 10+ years building, testing, and shipping ML/AI systems end-to-end, including 2+ years of experience with GenAI/LLM systems in production.
- Leadership Experience: 5+ years leading large, ambiguous technical initiatives as a senior IC, influencing roadmap and engineering/science direction across teams.
- Technical Proficiency:
- Deep expertise in evaluation methodology (offline/online alignment, metric design, human-in-the-loop evaluation, A/B testing, power analysis, regression testing).
- Hands-on experience with GenAI systems, including orchestration, retrieval, tool calling, memory, etc.
- Experience building data pipelines and quality systems (labeling workflows, dataset curation, versioning, monitoring, and governance).
- Solid ML fundamentals and best practices (model selection, training/serving, monitoring, reliability, and model lifecycle management).
- Benefits:
- This role may also be eligible for bonus, equity, benefits, and Employee Travel Credits.