Lead / Senior QA Engineer – with Temporal, LLM, and Langfuse

Remote, USA Full-time Posted 2026-05-31
Our client, Vertical Falls LLC, is seeking the following.
Job Title: Lead / Senior QA Engineer – Agentic AI Systems with Langfuse, Temporal

100% Remote

Interview Mode: 2 Video
6-12 Months Contract
We are looking for a highly skilled QA professional to build and scale a next-generation Agentic AI Quality Engineering function. This role goes beyond traditional QA, focusing on validating autonomous AI systems, designing evaluation frameworks, and ensuring high-quality outputs across multiple AI-driven products.

You will play a critical role in shaping how quality is defined, measured, and improved for agentic systems that operate with minimal human intervention.

    Key Responsibilities
  • Agentic QA Strategy & Scaling
      • Design and scale an agentic QA model for autonomous AI systems
      • Move QA from human-driven validation to AI-led evaluation and continuous quality monitoring
      • Establish best practices for testing AI agents across lifecycle stages
  • Product Quality Ownership
      • Own QA for 3 core AI products:
          • AI Contact Center solutions
          • AI Chat & Form-based interaction systems
          • AI Assistants (autonomous / semi-autonomous agents)
      • Define quality benchmarks, SLAs, and success metrics for each product
      • Proactively identify quality gaps ahead of customer impact
  • Metrics, Observability & Evaluation
      • Define and track performance outputs for agentic systems (accuracy, latency, resolution quality, hallucination rate, etc.)
      • Build frameworks for:
          • Evals & graders (LLM evaluation pipelines)
          • Output scoring and benchmarking
          • Continuous feedback loops
      • Leverage tools like Langfuse for:
          • LLM observability and tracing
          • Prompt monitoring and performance analysis
          • Debugging agent behavior in production
      • Analyze:
          • Downstream issues
          • Production tickets
          • Failure patterns
  • Automation & Testing Frameworks
      • Build and scale automation across:
          • Regression testing
          • Smoke testing
          • End-to-end agent workflows
      • Develop and maintain Playwright-based automation scripts
      • Integrate QA into CI/CD pipelines for continuous validation
  • Agentic Testing & Validation
      • Design testing approaches for:
          • Multi-step agent workflows
          • Context retention and reasoning
          • Tool usage by agents
      • Work with orchestration frameworks like Temporal to:
          • Validate long-running workflows
          • Test retries, state transitions, and failure handling in agent pipelines
      • Account for non-deterministic behavior in AI systems
      • Invest additional effort in agentic validation, recognizing its higher complexity versus traditional QA
  • Continuous Improvement & Innovation
      • Define frameworks to predict and prevent failures before customer exposure
      • Continuously improve QA processes using AI and automation
      • Partner with Product, Engineering, and AI teams to improve system quality
    Required Skills & Experience
  • 5–10+ years in QA / Quality Engineering, with strong automation experience
  • Hands-on experience with:
      • Test automation tools (Playwright preferred)
      • API and system testing
  • Strong understanding of:
      • AI/ML systems (LLMs, conversational AI preferred)
      • Evaluation frameworks and benchmarking
  • Experience with:
      • Temporal (workflow orchestration, stateful systems testing)
      • Langfuse (LLM observability, tracing, and evaluation)
  • Experience in:
      • Building QA frameworks from scratch
      • Working with production data, logs, and issue triaging
    Good to Have
  • Experience with LLM eval frameworks, prompt testing, or AI red-teaming
  • Familiarity with agentic architectures / autonomous systems
  • Exposure to observability and analytics platforms
    Working Model
  • Prefer candidates with EST time zone overlap
  • Ability to work closely with global product and engineering teams
    What Success Looks Like
  • A scalable, automated QA system for agentic products
  • Measurable improvement in AI output quality and reliability
  • Reduced production issues and faster detection of failures
  • QA evolving from reactive testing to proactive quality intelligence
