Confidential
AI Engineer
Full Time · Remote · $200k–$275k
Apply to this role. Takes ~2 minutes. Consent-first: your data, your control.
About the role
Overview: We are seeking a proactive AI Systems Engineer to design, implement, and maintain end-to-end AI systems that scale. You will bridge the gap between data science, software engineering, and operations, ensuring robust, reliable, and secure AI-enabled solutions that meet business objectives. The ideal candidate combines strong systems engineering discipline with hands-on experience in AI/ML model deployment, MLOps, and production-grade software.
Key Responsibilities:
Design, deploy, and operate production-grade AI systems and pipelines (data ingestion, preprocessing, model training, validation, deployment, monitoring, and retraining).
Collaborate with data scientists to translate research models into scalable, maintainable, and observable services.
Implement MLOps practices: versioning for data, models, and code; CI/CD for ML pipelines; automated testing and canaries; model governance and drift monitoring.
Build and maintain scalable data architectures (ETL/ELT, streaming, data lakes/warehouses) with emphasis on data quality, lineage, and observability.
Develop APIs and services for model inference, including high-throughput, low-latency endpoints; ensure security, authentication, and access controls.
Design and implement monitoring, alerting, and incident response for AI systems (model performance, data quality, system health, latency, cost).
Optimize infrastructure for cost, performance, and reliability (cloud platforms, containers, orchestration, GPUs/accelerators, edge devices where applicable).
Ensure compliance with privacy, security, and regulatory requirements; implement audit trails and reproducibility.
Collaborate with product managers and stakeholders to define requirements, success metrics, and acceptance criteria.
Mentor junior engineers; contribute to standard methodologies, documentation, and best practices.
Required Qualifications:
Bachelor's or Master's degree in Computer Science, Software Engineering, Electrical Engineering, Analytics, or related field (or equivalent practical experience).
3+ years of experience in systems engineering, ML/AI deployment, or MLOps.
Strong software engineering skills: proficiency in one or more general-purpose languages (e.g., Python, Java, Go, C++) and familiarity with software engineering best practices (version control, testing, code reviews).
Experience architecting and deploying end-to-end AI pipelines (data ingestion, feature engineering, model training, deployment, and monitoring).
Hands-on experience with ML frameworks (TensorFlow, PyTorch, scikit-learn) and model serving platforms (TensorFlow Serving, TorchServe, MLflow, Seldon, or similar).
Proficiency with cloud platforms (AWS, Azure, GCP) and containerization (Docker), orchestration (Kubernetes), and CI/CD tooling.
Strong understanding of data engineering concepts (ETL/ELT, data governance, data quality, lineage).
Experience with model monitoring and drift detection, A/B testing, and experimentation pipelines.
Familiarity with security and compliance practices (IAM, secrets management, encryption, audit logging).
Excellent problem-solving, communication, and collaboration skills; able to work cross-functionally.
Preferred Qualifications:
Master’s or PhD in a relevant field; specialization in ML systems, MLOps, or data engineering.
Experience with real-time inference, streaming data (Kafka, Kinesis), and feature stores.
Knowledge of DevOps fundamentals, SRE practices, and reliability engineering for AI systems.
Experience with edge AI deployments or on-device inference.
Publications or contributions to open-source ML systems projects.
Success Metrics:
System reliability: meet MTTR and uptime targets; implement automated recovery.
Throughput and latency: meet strict SLAs for inference requests.
Data quality: maintain data lineage, accuracy, freshness, and completeness.
Model performance: track drift and monitoring KPIs; trigger retraining when thresholds are breached.
Cost efficiency: optimize cloud and compute resources, monitor spend.
Security/compliance: ensure auditability and adherence to applicable regulations.
How your application is handled
When you apply, your resume and cover letter are uploaded and structured by Claude into hiring-relevant fields. You see every consent toggle individually and can withdraw at any time. We do not sell your data, do not share it with third-party brokers, and do not infer protected-class attributes. Read more on our privacy page.