We are ITernal Group, a reliable and reputable IT company specializing in complex software solutions. The company was established in 2019 by merging three companies into a single organization. Each of these companies had a background in different industries and technologies; the oldest was founded in 2004.
January 24, 2022

Site Reliability Engineer (vacancy inactive)

Kyiv, Kharkiv, remote

Project Description:
Our client is a Fortune 500 market leader in insurance and financial services. We are assembling a global SRE team to provide 24/7 system reliability coverage for the client’s application and data infrastructure hosted on Microsoft Azure.

Experience: 8+ years

Key Skills:

Azure, AKS, Terraform, Ansible, and troubleshooting skills


• Provide Level 2.5/3 support to monitor applications

• Cultivate Scrum, CloudOps, DevOps, and SRE best practices among the team members

• Manage applications deployed in AKS environments on Azure, ensuring scalability, replication, modularity, and the other benefits the platform offers.

• Troubleshoot production incidents in real-time and proactively identify system anomalies on Azure and AKS environments.

• Lead root cause investigations, work with the application teams and Microsoft, and bring observed application issues to closure.

• Design and build Infrastructure as Code and the automation required to set up cloud environments and CI/CD pipelines in Azure with ARM, Azure DevOps YAML, and Terraform.

• Monitor application performance and make sure the required security standards are implemented and not breached.

• Work with senior engineering and testing team members to build tools and testing strategies for problem prevention, detection, and chaos testing.

• Guide reliability practices across the entire software development lifecycle through activities such as architecture reviews, code reviews, creating platforms and frameworks, capacity planning, and working with development teams and other stakeholders.

• Improve service reliability through blameless post-incident reviews and using code to prevent or respond to problem recurrence.

• Work on a rotation basis and be able to provide weekend support during assigned shifts.

• Design and create robust, centralized logging, monitoring, and alerting systems.

Must have:

• Expertise in Azure Cloud technologies and solutions.

• Expertise with container technology and orchestration (AKS — Kubernetes, Docker).

• Expertise with IaC tools (Terraform, CloudFormation).

• Expertise with configuration management tools like Ansible.

• Expertise in troubleshooting and providing sustainable solutions.

• Proficiency with tools like Git and Bitbucket.

• Expertise with monitoring tools like AppDynamics.

• Experience with log management and the ELK Stack (Elasticsearch, Logstash, Kibana).

• Understanding of application servers, networks, and databases.

• Excellent understanding of scalability processes and techniques.

We Offer:

• Possibility to influence the development of the project.

• Friendly professional staff and warm atmosphere.

• Help with development via mentoring and coaching.

• An environment where you can implement your ideas.

• Growth plans and performance reviews (every 6 months).

• Flexible schedule and the opportunity to work remotely (8-hour workday).

• Paid vacation and sick leave.

• Medical insurance, gym.

• Participation in educational activities and thematic conferences.

• Team parties and corporate events.