Senior DevOps Engineer
Description

Mentee Robotics is building autonomous humanoid robots with an AI-first approach, combining perception, reasoning, and dexterous manipulation in a robot that adapts and learns on the job. Our robot, Menteebot v3, is built to work in real industrial, logistics, and retail settings, handling complex tasks with human-like flexibility.

Behind every robot is a deep software and ML stack that has to build, test, and ship reliably, at speed. We're looking for a Senior DevOps Engineer to build that foundation from the ground up. This is a builder's role: you'll be creating robotics-native pipelines, observability, and fleet deployment systems from scratch.

Why Mentee

Real ownership. You'll build the DevOps foundation from the ground up and set the practices the whole engineering team works by, not inherit and maintain someone else's playbook.
See your impact fast. We're small enough that the systems you build are used across the company within days, not quarters.
Innovation at the frontier. Work alongside researchers and engineers across ML, perception, control, and hardware, building infrastructure for complex challenges that require unique, first-principles engineering.
Build for scale that matters. Robotics workloads (training, simulation, HPC) push infrastructure in ways typical web companies never hit. The technical depth here is real.

Responsibilities

Architect and build GitOps-based CI/CD pipelines from the ground up, from code integration through testing to deployment
Build and scale our cloud infrastructure as code with Terraform
Build observability and deployment systems that span cloud, simulation, and the robot fleet
Design systems for performance, scalability, and reliability as workloads grow
Build automation in Python and Bash to engineer away manual, repetitive work
Debug deep, novel infrastructure problems and design out their root causes for good
Establish the DevOps standards and practices the whole engineering team builds on

Requirements

4+ years of DevOps experience building and scaling complex, large-scale production infrastructure
Experience designing and building new infrastructure, not only operating what already exists
Strong scripting skills in Python and/or Bash
Deep working knowledge of Git and version control workflows
Hands-on experience building CI/CD pipelines with GitHub Actions
Strong Linux systems administration and hands-on debugging across systems, networks, and applications

Advantages

Centralized logging and observability platforms (e.g. ELK, Datadog)
Familiarity with MLOps tooling and workflows
Experience with robotics or related fields
Working knowledge of Kubernetes and cloud-native tooling
Experience managing HPC clusters (e.g. Slurm)