• Experience with Java or Python.
• Solid knowledge of algorithms and data structures.
• Experience with AWS technologies: Glue, Lambda, Step Functions, S3, etc.
• Experience with developing data pipelines based on one of the mainstream frameworks like Spark/Flink/Presto etc.
• Experience with developing data lakes and data warehouses based on one of the mainstream technologies like Hive/Snowflake/OLAP/Hudi etc.
• Knowledge of SQL and solid experience with NoSQL or RDBMS technologies.
• Experience with data integrations and different data formats: CSV, Protobuf, Parquet, Avro, ORC, etc.
• Solid understanding of technical excellence and hands-on experience with code reviews, test development and CI/CD.
• Experience with developing Snowflake-driven data warehouses.
• Experience with developing Kinesis-driven streaming data pipelines.
• Contributing to investigations of new technologies and to the design of complex solutions
• Coming up with well-designed technical solutions and robust code
• Working and communicating professionally with the customer’s team
• Taking responsibility for delivering major solution features
• Participating in requirements gathering and clarification process
• Developing core modules and functions
• Performing code reviews and writing unit and integration tests
We are working with Beeswax, a rapidly growing US AdTech company. Founded by three ex-Googlers, it has a highly technical team and an excellent technical culture.
The project is about building the next generation of real-time bidding software that enables sophisticated marketers to break free from the limitations and constraints of opaque, one-size-fits-all programmatic buying platforms.
The streaming applications are written in Java, use Kinesis, and handle various scaling and configuration challenges. Batch ETL is managed by Airflow DAGs. An excellent working knowledge of SQL is critical, since many of the ETL steps run in Snowflake, where a poorly written query can have a significant performance and cost impact.
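To picture how Airflow organizes batch ETL, here is a minimal stdlib-only sketch of the underlying idea: tasks form a directed acyclic graph and run in dependency order. The task names are hypothetical illustrations, not the project's actual pipeline, and the sketch deliberately uses Python's `graphlib` instead of Airflow itself so it stays self-contained.

```python
# Sketch: a batch ETL flow as a DAG of dependent tasks, resolved in
# topological order -- the same idea an Airflow DAG expresses.
# All task names below are hypothetical.
from graphlib import TopologicalSorter

# Each task maps to the set of tasks it depends on.
dag = {
    "extract_events": set(),
    "load_to_snowflake": {"extract_events"},
    "aggregate_daily": {"load_to_snowflake"},
    "publish_report": {"aggregate_daily"},
}

def run_order(dag):
    """Return tasks in an order that respects all dependencies."""
    return list(TopologicalSorter(dag).static_order())

print(run_order(dag))
# -> ['extract_events', 'load_to_snowflake', 'aggregate_daily', 'publish_report']
```

In a real Airflow DAG, each node would be an operator (e.g. running a Snowflake query), and the scheduler, not your code, resolves this ordering.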
You will join the Big Data team on the customer’s side and cooperate closely with them to improve the platform: adding new features and making it more efficient performance-wise. The team currently works on several tools and applications; by joining, you would work with some or all of them:
• Manage and build on a high-scale event parsing and recording system. We use Kinesis to handle billions of events and ship them to S3, Snowflake, databases, and a variety of other logs, both internal and external
• Manage and build on a set of ETL pipelines that move terabytes of data through Snowflake
• Operate multiple services that provide real-time data flows to our internal systems (both the UI and the optimization engines)
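The event parsing and recording work above boils down to a parse-and-route step: decode each incoming record and group it by destination. The following is a simplified, stdlib-only Python sketch with hypothetical event types and sink names; the actual system is Java/Kinesis and may use a binary format such as Protobuf or Avro rather than JSON.

```python
import json

# Hypothetical sink names standing in for S3, Snowflake, etc.
SINKS = {"impression": "s3_log", "bid": "snowflake_stage", "click": "s3_log"}

def parse_event(raw: bytes) -> dict:
    """Decode one raw event record (JSON here for illustration)."""
    return json.loads(raw)

def route(records):
    """Group parsed events by destination sink; unknown types
    go to a dead-letter bucket for later inspection."""
    routed = {}
    for raw in records:
        event = parse_event(raw)
        sink = SINKS.get(event.get("type"), "dead_letter")
        routed.setdefault(sink, []).append(event)
    return routed

batch = [b'{"type": "bid", "price": 1.2}',
         b'{"type": "impression"}',
         b'{"type": "unknown"}']
print(route(batch))
```

At billions of events, the interesting work is in the parts this sketch omits: batching, retries, backpressure, and shard management on the Kinesis side.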