We’re looking for Site Reliability Engineers: a cross between systems and software engineers who are responsible for all operational aspects of the client’s e-commerce platform. The team is responsible for designing, building, monitoring, and maintaining the infrastructure of our internet-facing and internal services. We’re looking for engineers who want to take part in developing infrastructure software, maintaining it, and scaling the company’s technology stack. Come help us build a bigger and better product as a Site Reliability Engineer.
Ideal candidates will be able to discuss complex technical concepts with a diverse audience across all areas of the organization. They will remain calm under pressure and always strive to add structure to high-pressure, fast-paced tasks or projects.
What you’ll need:
— At least 5 years of experience working in an SRE role or similar.
— Hands-on experience with orchestration and system configuration tools such as Ansible, Puppet, Chef, Terraform, etc.
— Expertise in building and maintaining highly available applications, including redundancy, failover, scalability, monitoring, and performance.
— Strong experience with virtualization, monitoring and automation.
— Software development experience (both scripting and “programming” languages).
— Experience working with the open source community (troubleshooting, patch submission, etc.).
— 5+ years of demonstrated Linux system administration experience.
— Experience with CI tools such as Bamboo, Jenkins, Hudson.
— Ability to organize, troubleshoot and continuously learn.
— Previous experience working within compliance controls such as SOX, PCI, etc.
— This position requires travel.
We offer multiple benefits, including:
— Challenging work in an international professional environment
— Long-standing team, as this is a long-term project
— Mature and highly professional leadership team on the client’s side
— Mastering the English language with a native speaker
— Flexible work-from-home policy
— Competitive salary
— PE accounting and support
— 20 paid vacation days per year
— 14 paid sick days per year
— Annual $250 deposit for attending external events (conferences, workshops, etc.)
— Collaborative, friendly team environment
— Cozy, fully equipped office space in the city center (near the “Palats Ukraina” subway station)
What you’ll do:
— Focus on service stability and reliability by working with application owners to set SLOs, error budgets, and backup and disaster recovery (DR) strategies
— Define application monitoring and alerting strategy
— Perform capacity planning and production readiness assessment
— Embed with product teams during the design and requirements phase of new product development through to initial production launch
— Identify requirements for other operational teams (release engineering, automation, etc.) during the application development phase
— Be a technology and DevOps evangelist for the rest of the company
— Participate in the on-call rotation for level 3 support escalations
The client’s product is revolutionizing the pet industry as one of the fastest-growing e-commerce retailers of all time. It offers a convenient way to shop for pet supplies within a highly personalized experience that’s fueled by superior customer care. Headquartered in South Florida, the team of over 7,000 members, dispersed across customer service centers, corporate offices, and fulfillment centers, is dedicated to delivering pet happiness nationwide. The company’s environment is dynamic and faster than anything you’ve ever experienced, built for leaders who thrive on delivering results.