DataRobot is a Boston-based tech company with offices in New York, London, Kyiv, Singapore, Tokyo, and Sydney.
July 27, 2021

DevOps Engineer, Code & Architecture (position inactive)

Kyiv, Kharkiv, Lviv, Odesa, Khmelnytskyi, remote

Job Summary
DataRobot manages a variety of deployments for our cutting-edge AutoML, Time Series, and MLOps products. While we have several multi-tenant SaaS production environments in AWS, we also ship regular enterprise software releases for the diverse environments of our on-prem customers. You will play a key role in how DataRobot's tools and practices enable seamless scaling while preventing failures through world-class observability.

The Code & Architecture team is looking for a DevOps Engineer to help us build a world-class observability framework for complex multi-cloud environments. You'll work in close collaboration with engineering technical leadership to develop best-in-class monitoring and scalability tooling. We value engineers who are experts in DevOps tools and practices, who know how to build scalable, highly available infrastructure, and who are eager to chase challenges no matter where they lead. We are excited to share our unique culture in a fast-moving startup environment.

Responsibilities

  • Adopt the multi-account, cross-region AWS infrastructure.
  • Develop and improve instrumentation for monitoring and logging the health and availability of services.
  • Manage infrastructure and configuration as code.
  • Improve operational efficiency through scripting, bots, and integrations.
  • Automate and maintain the existing infrastructure.
  • Motivate, encourage, and provide technical leadership to team members.

Main Requirements

  • 3+ years of experience with AWS (multi-account, cross-region)
  • 3+ years of experience with Docker and container orchestration (Kubernetes, Mesos, etc.)
  • A passion for DevOps methodology and automation
  • Experience maintaining large-scale, geo-distributed infrastructure (1k+ servers)
  • Expertise in running complex monitoring and logging systems (Prometheus/Grafana, ELK, etc.)

Desired Skills

  • 3+ years of Unix systems administration
  • 3+ years of experience with Terraform/CloudFormation or Ansible
  • Solid experience in automating with Python, Go
  • Understanding of SLI/SLO fundamentals
  • A passion for collaborating and tearing down communication silos
  • Experience as a technical lead