• Designing and maintaining DevOps solutions for application deployment, infrastructure management, and disaster recovery.
• Proactively identifying and implementing improvements in high availability, performance, and observability of the product.
• Collaborating with other teams to ensure all systems meet the organization’s standards for security and reliability.
• Debugging and troubleshooting configuration-related issues.
Requirements:
• 5+ years of experience in DevOps or Site Reliability Engineering
• Strong Linux and networking skills
• Experience with Docker containers and orchestration solutions (Kubernetes, Nomad)
• Solid experience in configuration management and IaC (Ansible, Terraform)
• Knowledge of and experience with cloud-native tools: CI/CD (Argo CD, Flux, Jenkins), networking and security (Calico, Cilium), app definition and build tools (Helm, Kubernetes operators), service mesh (Istio), monitoring (Prometheus stack), logging (Elastic stack, Grafana Loki, Fluentd, Fluent Bit)
• Programming skills in Python/Bash
• Intermediate English
Nice to have:
• Experience with Couchbase, MongoDB, or another NoSQL database
• Experience with cloud providers: AWS, GCP