secrethunter.io: Israel's job board

DevOps Team Lead

AnyClip · Rehovot

About the Role

We are looking for a hands-on DevOps Team Lead to lead our multi-cloud infrastructure and drive the integration of large-scale platforms across AWS and GCP. You will be responsible for production reliability, cloud operations, observability, security, FinOps, and infrastructure strategy while leading and mentoring a team of DevOps engineers.

This is a high-impact player-coach role requiring both strong technical leadership and deep hands-on expertise in operating large-scale distributed systems, real-time data pipelines, and mission-critical production environments serving billions of events per day.

Key Responsibilities

- Lead, mentor, and develop DevOps and Platform engineers while remaining highly hands-on.
- Own and evolve cloud infrastructure across AWS and GCP, including Kubernetes-based platforms (EKS/GKE), networking, IAM, storage, and core infrastructure services.
- Lead infrastructure integration efforts during acquisitions, platform consolidations, and cloud migration projects.
- Design, deploy, and maintain Infrastructure-as-Code using Terraform.
- Act as the primary escalation point for infrastructure and production issues.
- Lead incident response, post-mortems, and continuous operational improvements.
- Build and maintain observability platforms using Prometheus, Grafana, Datadog, and related tools, including monitoring standards, alerting strategies, SLOs, and SLAs.
- Support large-scale data pipelines, real-time event processing systems, and high-throughput production environments handling billions of events.
- Collaborate with engineering teams to improve reliability, observability, scalability, and performance across production systems.
- Troubleshoot and optimize large-scale distributed systems, including capacity planning and performance tuning.
- Lead cloud cost optimization initiatives across AWS and GCP, including FinOps practices, resource governance, and cost visibility.
- Support SOC 2, ISO 27001, and infrastructure security initiatives, implementing operational controls and security best practices.

What You'll Bring

- 5+ years of hands-on experience managing large-scale production environments on AWS, with practical experience in GCP.
- Proven experience leading DevOps, SRE, or Platform Engineering teams, including mentoring engineers, driving operational excellence, and taking ownership of mission-critical production environments.
- Deep expertise in Kubernetes (EKS/GKE), cloud networking, infrastructure security, and Infrastructure-as-Code using Terraform, along with Karpenter and KEDA.
- Experience with infrastructure tooling (Ansible, Chef).
- Strong experience supporting distributed data platforms and production services, including Kafka (MSK), Redis, OpenSearch, and similar technologies.
- Strong experience operating highly available distributed systems, large-scale data pipelines, streaming platforms, and real-time event processing environments.
- Hands-on experience with observability and production operations, including Prometheus, Grafana, Datadog, monitoring, alerting, incident response, root cause analysis, and performance optimization.
- Experience with capacity planning, cloud cost optimization (FinOps), and infrastructure governance.
- Experience leading infrastructure integration during acquisitions, platform consolidations, or large-scale cloud migrations.
- Strong troubleshooting skills and the ability to perform effectively under pressure in complex production environments.

Bonus Points For

- Experience supporting SOC 2, ISO 27001, or similar security and compliance frameworks.
- Experience with AdTech, MarTech, Gaming, Analytics, or other high-scale data-driven platforms.
- Experience with ClickHouse, BigQuery, Redshift, Snowflake, or similar analytics platforms.
- Experience with VictoriaMetrics, Thanos, Cortex, ArgoCD, Flux, or other modern observability and GitOps tools.
- Proficiency in Python, Go, or Bash for automation and tooling.
