Senior Machine Learning Engineer - LLM Evaluation / Task Creation (India-Based)
## **Role Description**
Mercor is hiring on behalf of a leading AI research lab to bring on highly skilled **Machine Learning Engineers** with a proven record of building, training, and evaluating high-performance ML systems in real-world environments. In this role, you will design, implement, and curate high-quality machine learning datasets, tasks, and evaluation workflows that power the training and benchmarking of advanced AI systems.

This position is ideal for engineers who have excelled in competitive machine learning settings such as Kaggle, possess deep modelling intuition, and can translate complex real-world problem statements into robust, well-structured ML pipelines and datasets. You will work closely with researchers and engineers to develop realistic ML problems, ensure dataset quality, and drive reproducible, high-impact experimentation.

**Candidates should have 2+ years of applied ML experience or a strong record in competitive ML, and must be based in India.** Ideal applicants are proficient in Python, experienced in building reproducible pipelines, and familiar with benchmarking frameworks, scoring methodologies, and ML evaluation best practices.

* * *
## **Responsibilities**
- Frame novel ML problems that strengthen the ML capabilities of LLMs.
- Design, build, and optimise machine learning models for classification, prediction, NLP, recommendation, or generative tasks.
- Run rapid experimentation cycles, evaluate model performance, and iterate continuously.
- Conduct advanced feature engineering and data preprocessing.
- Implement adversarial testing, model robustness checks, and bias evaluations.
- Fine-tune, evaluate, and deploy transformer-based models where necessary.
- Maintain clear documentation of datasets, experiments, and model decisions.
- Stay updated on the latest ML research, tools, and techniques to push modelling capabilities forward.
* * *
## **Required Qualifications**
- At least **2 years** of full-time experience in machine learning model development
- Technical degree in Computer Science, Electrical Engineering, Statistics, Mathematics, or a related field
- Demonstrated competitive machine learning experience (Kaggle, DrivenData, or equivalent)
- Evidence of top-tier performance in ML competitions (Kaggle medals, finalist placements, leaderboard rankings)
- Strong proficiency in **Python**, **PyTorch/TensorFlow**, and modern ML/NLP frameworks
- Solid understanding of ML fundamentals: statistics, optimisation, model evaluation, architectures
- Experience with distributed training, ML pipelines, and experiment tracking
- Strong problem-solving skills and algorithmic thinking
- Experience working with cloud environments (AWS/GCP/Azure)
- Exceptional analytical, communication, and interpersonal skills
- Ability to clearly explain modelling decisions, tradeoffs, and evaluation results
- Fluency in English
* * *
## **Preferred / Nice to Have**
- Kaggle **Grandmaster**, **Master**, or multiple **Gold Medals**
- Experience creating benchmarks, evaluations, or ML challenge problems
- Background in generative models, LLMs, or multimodal learning
- Experience with large-scale distributed training
- Prior experience in AI research, ML platforms, or infrastructure teams
- Contributions to technical blogs, open-source projects, or research publications
- Prior mentorship or technical leadership experience
- Published research papers (conference or journal)
- Experience with LLM fine-tuning, vector databases, or generative AI workflows
- Familiarity with MLOps tools: Weights & Biases, MLflow, Airflow, Docker, etc.
- Experience optimising inference performance and deploying models at scale
* * *
## **Why Join**
- Gain exposure to cutting-edge AI research workflows, collaborating closely with data scientists, ML engineers, and research leaders shaping next-generation AI systems.
- Work on high-impact machine learning challenges while experimenting with advanced modelling strategies, new analytical methods, and competition-grade validation techniques.
- Collaborate with world-class AI labs and technical teams operating at the frontier of forecasting, experimentation, tabular ML, and multimodal analytics.
- Flexible engagement options (**30–40 hrs/week or full-time**), ideal for ML engineers eager to apply Kaggle-level problem solving to real-world, production-grade AI systems.