We are looking for a seasoned Site Reliability Engineer to join our team and support its strategy of building products and technology into everything it delivers to accelerate business growth. As an SRE, you’ll work as part of a team of problem solvers, helping to solve complex business issues from strategy to execution.
The team covers a variety of responsibilities that are executed by DevSecOps, Site Reliability and ML Ops Engineers, including:
- Defining standard reliability and resilience for infrastructure and application components.
- Proactively optimizing redundancy, monitoring, and alerting practices and patterns.
- Developing resilient and highly available distributed systems.
- Infrastructure as Code development for building cloud tools.
- Secrets and configuration management
- Monitoring systems and services, providing incident and emergency response to triage and resolve system or client issues
- Managing the application ecosystem and improving platform infrastructure and applications for high reliability, resiliency, performance, and quality
- Supporting documentation, knowledge articles, and runbooks
- Designing, building, and implementing SRE patterns that adhere to our client’s security guidelines and policies.
REQUIRED QUALIFICATIONS
- Availability until 9-10 pm Ukrainian time is required.
- At least 4 years of relevant working experience.
- Advanced Kubernetes: Must have strong skills in running Kubernetes at scale on one of GKE, AKS, EKS, or RKE. Experience with kubectl and Helm.
- Containers: Experience deploying Java (Spring Boot) microservices in dockerized environments.
- Observability: Experience in setting up tools like Prometheus/Grafana, Datadog, AppDynamics, and Splunk to provide actionable insight into a microservice environment, including but not limited to synthetics, application performance monitoring, logging, and alerting (PagerDuty/Opsgenie integrations).
- Good CI/CD expertise: Jenkins, Azure DevOps, GitHub Actions, Argo CD, Artifactory, Azure Container Registry, Google Container Registry, and similar tooling.
- SCM: Working with tools like GitHub/GitLab for source code management, as well as experience with branching strategies like GitFlow and trunk-based development.
- Strong troubleshooting skills: Able to go all the way down to code level to give development teams a head start on application issues, and to contribute effectively to root cause analysis after a problem is resolved.
- Good communication skills: Active listening, verbal and non-verbal communication, clarity and concision, confidence, open-mindedness, respect.
- Good documentation skills: Able to document automation and other technical work clearly so that solutions are easy to adopt.
- Good collaboration skills — Must be able to work effectively with Scrum/Dev teams with a push/pull (push back and prioritize work pulled in) philosophy to manage expectations and contribute to the stability and improvement of the platform.
NICE TO HAVE QUALIFICATIONS
- IaC: Terraform, Pulumi. Preferably has developed modules rather than only used them.
- Security: Experience with encryption-at-rest and encryption-in-transit patterns, and with tools like Azure Key Vault, HashiCorp Vault, and Google Cloud KMS.
- Security: Experience with tools like Veracode and Black Duck for AppSec testing, Qualys scanners for infrastructure testing, and Twistlock/Aqua for container scanning.
- Automation: Able to identify toil and opportunities to reduce it within the team.
- Authentication/Authorization — Familiarity with Authn/Authz schemes like OpenID, OAuth 2.0, SAML.
- Scripting and Programming: Experience with Python, PowerShell, Go, Java, Node.js.
- Event-Driven/Event Sourcing Patterns: Familiarity with distributed event streaming and messaging platforms like Kafka, Azure Event Hubs, and RabbitMQ, and with patterns like CQRS.
WHAT WE OFFER:
- Competitive salary and benefits package
- A brand-new office in the “Natsionalnyy” business center + remote working options
- 18 business days of vacation, 10 days of sick leave, national holidays off
- Compensation for technical conferences/events participation
- Free English classes to further enhance your language skills
- Gym and shower in the office
- Medical insurance
By joining devspiration, you will become part of a professional and patriotic team that supports the Armed Forces of Ukraine and people in need. If you are passionate about technology and want to make a meaningful impact, we invite you to join our team and help us shape the future. 💙