Software Development Engineer (Agentic AI & LLM Platforms) – Work Remotely (EST Hours) – Must Be Able to Obtain Public Trust – No 3rd Parties
Must be able to obtain a Public Trust
Must be able to work remotely during EST hours
Overview
- We are building the next generation of agentic AI to transform how the agency accelerates research, makes decisions, and ships products at scale.
- We are a small, startup-minded team that ships fast and owns what we build end-to-end.
- We are looking for an SDE II who is hungry to contribute to a real production system, not a sandbox.
- You will work across the application and infrastructure layers, implement features that users interact with every day, and be expected to own what you build from design through deployment.
- You will not be handed perfectly scoped tickets.
- You will be expected to ask good questions, figure things out, and move.
- The best person for this role communicates clearly, collaborates without ego, and brings genuine empathy for the users whose work they are making better.
- You are a self-starter with a high bar and a high sense of urgency.
- You play well with others and make the people around you better.
What You Will Do
- Build Agentic AI Systems
- Implement and iterate on our agentic workflows: tool-calling, multi-step reasoning, planning, memory, and agent-to-agent (A2A) communication patterns at the application layer
- Build and maintain MCP (Model Context Protocol) client-side integrations: how agents discover, invoke, and compose tools
- Implement tool definitions, input/output schemas, error handling, retry logic, and result formatting for GRACE's growing tool library
- Contribute to multi-agent orchestration patterns that are reliable and debuggable in production, not just in demos
- Build LLM-Powered Features
- Implement LLM orchestration logic: prompt construction, context management, model selection, and response parsing across OpenAI GPT, Anthropic Claude, and Google Gemini
- Build and maintain RAG pipeline components: query formulation, result ranking, citation grounding, and hallucination mitigation
- Implement and iterate on prompt engineering patterns and system prompts that drive quality and consistency across model families
- Contribute to context window budget management: truncation, summarization, and pagination logic that makes the right call at runtime
- Build LLM evaluation components: grounding assessment, regression tests, safety checks, and quality metrics
- Write prompts and pipelines with token economics in mind; cost-per-query is a real constraint, not an afterthought
- Own the Backend
- Build secure, well-tested backend features end-to-end: from application logic through to the API contract the frontend consumes
- Implement integrations with internal and external data sources and APIs, including Dimensions, Google Search, Slack, SharePoint, and LLM provider APIs
- Contribute to monitoring, logging, and distributed tracing so that failures are diagnosable and regressions are caught before users report them
- Implement fallback, retry, and graceful degradation patterns for AI service dependencies
- Write production-quality code: readable, tested, reviewed, and documented
- Contribute to Infrastructure
- Work within Microsoft Azure infrastructure: Azure Functions, Azure API Management, Azure Container Apps, and Azure OpenAI Service
- Contribute to CI/CD pipelines, deployment automation, and release processes
- Work with containerization tools and infrastructure as code; understand the environment your code runs in
- Contribute to application-level SLOs: tool call success rates, response quality, and latency from the user's perspective
- Collaborate and Grow
- Participate actively in design reviews, sprint planning, and retrospectives; ask good questions and push back when something does not add up
- Communicate technical decisions clearly to both engineers and non-engineers; no one should have to guess what you built or why
- Work closely with the PM, researcher, designer, and senior engineers to translate ambiguous requirements into clear, actionable implementations
- Bring genuine curiosity and empathy to every feature; understand who is using what you build and why it matters to them
- Ensure strong privacy, security, and compliance in all systems, integrations, and data handling
Basic Qualifications
- Bachelor's or Master's degree in Computer Science, Software Engineering, or a related field, or equivalent practical experience
- 3+ years of professional software engineering experience building and operating production systems
- Proven experience in high-velocity environments where you contributed to shipping real products end-to-end
- Strong proficiency in Python and at least one other backend language; familiarity with modern backend frameworks and async patterns
- Solid understanding of algorithms, data structures, distributed systems, and software design patterns
- Experience building and operating systems on major cloud platforms (AWS, GCP, or Azure)
- Experience with containerization (Docker) and working within CI/CD pipelines
- Clear, direct communicator who gives and receives feedback well, works with empathy, and makes the people around them better
Preferred Qualifications
- Hands-on experience building features on top of LLMs in production: tool-calling, RAG, multi-step reasoning, and context management
- Familiarity with A2A (Agent-to-Agent) communication patterns and multi-agent orchestration frameworks
- Familiarity with MCP at the client/consumer layer: how agents discover and invoke tools via MCP
- Working knowledge of prompt engineering and LLM behavior across model families; you understand why Claude and GPT respond differently to the same prompt
- Experience with LLM evaluation, grounding assessment, or regression testing for AI-powered systems
- Awareness of token economics at the application layer: cost-per-query, context budget management, and prompt efficiency
- Experience on Microsoft Azure: Azure Functions, API Management, Container Apps, or Azure OpenAI Service
- Familiarity with secrets management, least-privilege access, and security-conscious engineering practices
- Experience in startup or early-stage environments: comfort with ambiguity, rapid iteration, and wearing multiple hats
- Experience in healthcare, life sciences, or other regulated domains is a plus but not required
Why This Role
- You will work on a production system that real users depend on every day to do meaningful work.
- You will not be one of hundreds of engineers on a feature nobody uses.
- You will see the impact of what you build quickly, get direct feedback, and have real ownership over your work.