SPD Technology is a place where everyone knows how to develop awesome software, does it well, and wants to do it better. We write more than code: we create solutions with business needs in mind. We want to be part of innovation, and to get there we’re ready to learn and gain new expertise.
January 29, 2024

Site Reliability Engineer (Java development experience)

Kyiv, Lviv, Cherkasy, remote

At SPD Technology, we bring together a team of like-minded people who are driven by the desire to bring value through their work, united in their commitment to high performance and delivering custom, cutting-edge tech solutions that drive clients’ growth. We empower our people with a culture of excellence and give them the opportunity to take ownership and contribute at every level. We value humanity and collaboration, encourage professional and personal growth, and foster a supportive and flexible work environment where everyone’s contribution is welcomed.

We are looking for a Site Reliability Engineer to join our team.

About the project:

PitchBook is a platform for investment professionals. Our software provides access to data and analytical tools to get answers fast and discover promising opportunities. It uncovers actionable insights and trends hidden within the financial data of more than three million companies. Users all over the world include large corporations, start-ups, venture capital and private equity firms, investment banks, and many others.
Features of PitchBook: Advanced search / Discovery & insights / Company profiles / Workflow & efficiency / Many more.

About the role:

As a Senior Site Reliability Engineer in PitchBook’s engineering team, you will be creating and evolving systems to automatically run our suite of products and services reliably and consistently.
You will use your strong background in deploying, managing, and maintaining production systems, working with developers to operate and monitor large-scale services with complex distributed systems and data integrations. You will incorporate observability tools (monitoring, tracing, alerting), perform incident management, conduct root cause analyses, eliminate single points of failure, build reliability and redundancy into our infrastructure, establish and test our recoverability, mitigate failures, and do all of this through automation and tools.

For this role, deep expertise in automating monitoring and mitigation through code is essential. You should demonstrate the ability to use code for streamlined system responses, ensuring quick, consistent reactions to anomalies. This skill is crucial for maintaining our infrastructure’s resilience and scalability.

As a senior SRE, you will take independent responsibility for building and managing large subsets of our systems. You will help build our best practices for infrastructure-as-code and your code will exemplify our quality controls. You will mentor and train other SREs, platform engineers, and software engineers in reliability topics.

Your ability to collaborate with colleagues, exhibit poise and adaptability in stressful situations, communicate effectively, and build resilient systems that can be consistently relied upon will be critical to your success. You will solicit feedback, learn constantly, engage others with empathy, and help create a culture of belonging, teamwork, and purpose.

You’ll work with the following stack: GCP, Kubernetes, Gitlab CI, Java, Prometheus, ELK, Helm, PostgreSQL, RabbitMQ, Redis, MSSQL, Linux, Puppet, Terraform.

Team Structure: 1 Lead, 4 SRE specialists

Schedule: Flexible (with the need to attend team meetings)

As a qualified expert, you will

  • Participate in incident management processes, troubleshoot issues, and conduct root cause analysis
  • Build recoverability into our services and systems, including disaster recovery (DR), backup and recovery, and incorporation of multi-AZ and multi-region designs into cloud constructs
  • Manage connectivity (CIDRs, VPCs, Subnets), latency, and availability across distributed systems
  • Establish clustering and load balancing techniques for high availability and scalability in containerized cloud native environments
  • Build observability systems and services (monitoring, tracing) for reuse in our platform architecture, creating alerting for fault identification and building dashboards for metrics
  • Operate and continuously improve our services’ reliability, scalability, performance, security, and uptime
  • Learn constantly, including about available managed cloud services (PaaS/SaaS/IaaS), libraries, frameworks, and platforms (commercial and open source)

We’re looking for you if you have

  • 5+ years’ experience building and maintaining Windows/Linux/UNIX based systems, primarily in cloud environments (preferably GCP)
  • 4+ years’ experience coding in an object-oriented language, preferably Java or Kotlin
  • 2+ years’ experience with containers and orchestration platforms, including Kubernetes and Docker
  • Deep knowledge of infrastructure systems, networking, and security, including in a cloud environment
  • Experience with leading monitoring, logging, and tracing platforms (e.g., Prometheus, Grafana, ELK stack, Jaeger) to ensure comprehensive visibility into infrastructure and application health
  • Experience owning operational reliability, scalability, recoverability (backups, disaster recovery, failover) and capacity planning
  • Experience performing operational activities including batch processing, system backups, maintenance, monitoring and providing on-call support
  • Experience with distributed, scalable microservices and event-driven architectures
  • Experience with data storage, replication, caching, and search technologies, such as PostgreSQL, MSSQL, GCP CloudSQL, Redis, Elasticsearch, and Lucene/Solr
  • Ability to design meaningful alerts with minimal false positives, coupled with strong analytical skills for troubleshooting and a drive to continually enhance monitoring practices
  • English — B2 level

Bonus Points

  • Hold at least one professional certification in GCP (DevOps or SysOps Engineer preferred)

What’s in it for You

Reveal great tech solutions

Join the team of individuals who care about what they do and how they do it, and are accountable for the result and high performance. Unleash your potential, tackle new challenges, and be part of a team that values your skills and contributions.

Experience an agile and flexible working environment

Work fully remote, from our office hubs, or in a hybrid model. Enjoy 20 business days of paid vacation, unlimited sick leave, and 4 days of emergency leave.

Feel cared about

Prioritize your well-being with a yearly medical insurance budget or financial reimbursement of expenses for medical services outside Ukraine. Get compensation for sports, equipment, massage, and rehabilitation, along with access to our well-being program, corporate loan, and tax and legal support.

Embrace the opportunity for personal and professional growth

Take advantage of an individual learning and certification budget, career paths and personal development plans, company-wide tech and cultural events, educational leave, language courses, access to our corporate library, and more.

Interview steps:

  • Pre-screening Interview with recruiter
  • Technical interview
  • Interview with HR Partner
  • Interview with the client

About SPD Technology

SPD Technology is a custom software product development and IT consulting company with extensive expertise in various industries, including fintech, e-commerce, logistics, insurance, biotech, cybersecurity, and more. Our world-class team of over 600 experts develops web, mobile, AI/ML, and enterprise solutions for world-renowned companies, including Fortune 500 firms and emerging startups. We have 4 development centers in Europe (Ukraine), a representative office in London, U.K., and remote teams working worldwide. With over 17 years of experience in designing, building, streamlining, and supporting software products, SPD Technology drives the growth of businesses from the US, the U.K., Israel, Switzerland, Mexico, and other countries.

Embrace the opportunity to innovate with us!
