Our ML teams are part of our core Data Science and Machine Learning group and consist of the Online ML and Offline ML teams. The Online ML (ML Platform) team is responsible for building and operating low-latency, data-intensive systems such as the feature store, feature extraction, ML model serving, and versioning systems. The Offline ML (ML Operations) team is responsible for the ML model release process, ML pipelines, model training, and validation.
The ML Operations team aims to deliver frequent, trusted, and stable model releases to Sift customers by continuously evolving our processes and tooling. We serve customers across multiple verticals, including online commerce, delivery services, finance, and travel, in both developed and developing countries. Our technology helps protect Internet users from ever-evolving online scams, payment fraud, abusive content, account takeover, and more. We are a forward-thinking team that constantly challenges itself and the status quo to push the boundaries of machine learning and data science across multiple product offerings at Sift, and we collaborate with product engineering teams to deliver tangible customer value.
Our technical stack: Java, Apache Airflow, Apache Spark, Databricks, Dataproc, Snowflake, and GCP (GKE, Bigtable).
Opportunities for you:
- Professional growth: quarterly Growth Cycles instead of performance reviews;
- Experience: knowledge sharing through biweekly Tech Talk sessions. You will learn how to build systems that handle petabytes of data with low latency and high fault tolerance;
- Business trips and the annual Sift Summit; in 2022, the Summit took place in California;
- Remote work: you can choose where you work best.
What would make you a strong fit:
- You have a growth mindset and a strong interest in working with real-world machine-learning applications and technologies;
- Excellent communication skills and collaborative work attitude;
- 7+ years of professional software development experience, including big data systems;
- Experience building highly available, low-latency systems using Java, Scala, or other object-oriented languages;
- Experience working with large datasets and data processing technologies for both stream and batch processing, such as Apache Spark, Apache Beam, Flink, and MapReduce;
- Knowledge of GCP or AWS cloud stack for web services and big data processing;
- Basic knowledge of MLOps practices for model release, training, and monitoring;
- Conceptual knowledge of ML techniques;
- B.S. in Computer Science (or a related technical discipline), or equivalent practical experience.
If you have 7+ years of experience developing low-latency, high-load applications in Java or Scala, are interested in moving into big data engineering, are self-motivated in managing your own development, and are willing to dedicate time to rapidly growing your skills, we encourage you to apply for this role.
What you’ll do:
- Build massive data pipelines and batch jobs that are part of our model training process using the latest technologies such as Apache Spark, Flink, Airflow, and Beam;
- Prototype and explore the latest machine learning and analytics technologies, such as model-serving frameworks and workflow orchestration engines;
- Collaborate with machine learning engineers and data scientists and contribute to internal tooling, experimentation, and prototyping efforts;
- Simplify, re-architect, and upgrade the existing pipeline technologies.
Benefits and Perks:
- A compensation package that consists of financial compensation, a biannual 5% bonus, and stock options;
- Medical, dental, and vision coverage;
- $50 for sports and wellness;
- Education reimbursement: books, courses, and conferences;
- Flexible time off: we follow an unlimited vacation approach;
- Work schedule tuned to the Kyiv time zone despite the US office locations: biweekly demo sessions are optional for our team, and we watch the recordings;
- Mental Health Days: additional days off;
- English courses and social activities within the company to help you improve your public speaking and language skills.
Our interview process:
- 45-minute introduction call with a recruiter;
- 60-minute technical screen with a medium LeetCode-style problem-solving task;
- Virtual onsite with the team, taking approximately 3.5 hours (system design, coding, deep dive, and values-based interviews).
During these sessions, you will have the opportunity to learn about the company culture, meet engineers from your team, and discuss distributed systems problems. You will have time to ask any questions you find interesting and gain transparency into your future responsibilities and the project.