Lead / Senior QA Engineer – Agentic AI Systems (Temporal, Langfuse, LLMs)
- Our client, Vertical Falls LLC, is seeking the following.
- Job Title: Lead / Senior QA Engineer – Agentic AI Systems with Langfuse, Temporal
100% Remote
Interview Mode: 2 Video
6-12 Month Contract
We are looking for a highly skilled QA professional to build and scale a next-generation Agentic AI Quality Engineering function. This role goes beyond traditional QA, focusing on validating autonomous AI systems, designing evaluation frameworks, and ensuring high-quality outputs across multiple AI-driven products.
You will play a critical role in shaping how quality is defined, measured, and improved for agentic systems that operate with minimal human intervention.
- Key Responsibilities
  - Agentic QA Strategy & Scaling
    - Design and scale an agentic QA model for autonomous AI systems
    - Move QA from human-driven validation to AI-led evaluation and continuous quality monitoring
    - Establish best practices for testing AI agents across lifecycle stages
  - Product Quality Ownership
    - Own QA for three core AI products:
      - AI Contact Center solutions
      - AI Chat & Form-based interaction systems
      - AI Assistants (autonomous / semi-autonomous agents)
    - Define quality benchmarks, SLAs, and success metrics for each product
    - Proactively identify quality gaps ahead of customer impact
  - Metrics, Observability & Evaluation
    - Define and track performance metrics for agentic systems (accuracy, latency, resolution quality, hallucination rate, etc.)
    - Build frameworks for:
      - Evals & graders (LLM evaluation pipelines)
      - Output scoring and benchmarking
      - Continuous feedback loops
    - Leverage tools like Langfuse for:
      - LLM observability and tracing
      - Prompt monitoring and performance analysis
      - Debugging agent behavior in production
    - Analyze:
      - Downstream issues
      - Production tickets
      - Failure patterns
  - Automation & Testing Frameworks
    - Build and scale automation across:
      - Regression testing
      - Smoke testing
      - End-to-end agent workflows
    - Develop and maintain Playwright-based automation scripts
    - Integrate QA into CI/CD pipelines for continuous validation
  - Agentic Testing & Validation
    - Design testing approaches for:
      - Multi-step agent workflows
      - Context retention and reasoning
      - Tool usage by agents
    - Work with orchestration frameworks like Temporal to:
      - Validate long-running workflows
      - Test retries, state transitions, and failure handling in agent pipelines
    - Account for non-deterministic behavior in AI systems
    - Invest additional effort in agentic validation, recognizing its higher complexity compared with traditional QA
  - Continuous Improvement & Innovation
    - Define frameworks to predict and prevent failures before customer exposure
    - Continuously improve QA processes using AI and automation
    - Partner with Product, Engineering, and AI teams to improve system quality
- Required Skills & Experience
  - 5–10+ years in QA / Quality Engineering, with strong automation experience
  - Hands-on experience with:
    - Test automation tools (Playwright preferred)
    - API and system testing
  - Strong understanding of:
    - AI/ML systems (LLMs, conversational AI preferred)
    - Evaluation frameworks and benchmarking
  - Experience with:
    - Temporal (workflow orchestration, stateful systems testing)
    - Langfuse (LLM observability, tracing, and evaluation)
  - Experience in:
    - Building QA frameworks from scratch
    - Working with production data, logs, and issue triaging
- Good to Have
  - Experience with LLM eval frameworks, prompt testing, or AI red-teaming
  - Familiarity with agentic architectures / autonomous systems
  - Exposure to observability and analytics platforms
- Working Model
  - Prefer candidates with EST time zone overlap
  - Ability to work closely with global product and engineering teams
- What Success Looks Like
  - A scalable, automated QA system for agentic products
  - Measurable improvement in AI output quality and reliability
  - Reduced production issues and faster detection of failures
  - QA evolving from reactive testing to proactive quality intelligence