We are:
Wix’s Monitoring Engineering team. We’re responsible for the day-to-day monitoring of large-scale, mission-critical production systems that run on public clouds and in our own data centers (AWS, GCE, and physical servers), using Grafana, Prometheus, ClickHouse, and the usual databases. We automate, manage, and maintain the global infrastructure that makes Wix tick.
We’re looking for a service-oriented Site Reliability Engineer to join our team.
You are:
• Passionate about Linux systems, automation, and creating high-availability systems.
• You have 3+ years of SRE/DevOps Engineer experience.
• You can develop and maintain production-ready services.
• You have good knowledge of system programming languages such as Python, Ruby, or Go.
• You have solid experience debugging complicated production incidents in real time and an understanding of Ops/DevOps principles.
As a Site Reliability Engineer, you will:
• Build, deliver, and maintain high-visibility, production-critical libraries and tools that run in services across the company.
• Participate in code and design reviews, and mentor and cultivate other engineers.
• Interact with developers across the company to understand their challenges, and work with leaders on the team to develop a roadmap for the application framework and tools that address them.
• Drive engineering excellence on common libraries and tools that will be run in hundreds of projects across Wix.
• Provide a strong technical voice for your team.
• Work with internal Wix customers across the company, driving technical and tooling decisions that ensure a high bar for operational excellence.
• Shape the future of how Wix detects and responds to issues.