Lead Software Engineer – ML & Agentic Workloads

Lead Anywhere Remote AI Jobs by Marathon Digital Holdings

About the Role

You will design, build, and scale systems that power agentic and intelligent workloads. You will lead production ML integrations from model selection and evaluation to deployment pipelines, implement guardrails for content safety and hallucination control, and create prompt lifecycle pipelines with versioning and CI/CD. You will build and optimize retrieval-augmented generation systems, configure vector databases and retrievers, define observability and evaluation metrics, and collaborate with other teams to design scalable APIs and services. You will mentor engineers and drive best practices for secure AI development and privacy-preserving data handling.

Requirements

8+ years of professional software engineering experience, including 3+ years in ML application development or AI platform engineering
Proficiency in Python and ML toolchains such as PyTorch and Hugging Face
Experience with model evaluation, fine-tuning, and deployment across cloud and on-prem environments
Hands-on experience with RAG architectures and vector databases (Weaviate, Milvus, pgvector, LanceDB, FAISS)
Deep understanding of prompt design, orchestration, and versioning with CI/CD and automated testing
Familiarity with agentic systems and visual-builder interfaces
Knowledge of guardrail techniques including rule-based filters and policy evaluators
Experience deploying ML systems on Kubernetes and serverless environments with observability tooling
Solid understanding of API design, microservice architecture, and data pipeline integration
Excellent communication and leadership skills

Responsibilities

Lead architecture and development of agentic platforms
Evaluate and deploy foundation and open-source models
Design and maintain prompt lifecycle pipelines with version control and CI/CD
Build and optimize retrieval-augmented generation systems
Implement guardrail frameworks for content safety and hallucination control
Integrate and extend agentic frameworks and visual orchestration tools
Design scalable APIs and services for model-driven applications
Define observability and evaluation metrics for model performance
Drive secure AI development and privacy-preserving data handling
Mentor engineers across ML, backend, and platform domains
