5+ years of experience building data products with Python
Solid knowledge of algorithms and data structures
Experience with AWS technologies — Glue, Lambda, Step Functions, S3, etc.
Experience developing data pipelines with a mainstream framework such as Spark, Flink, or Presto
Experience developing data lakes and data warehouses with mainstream technologies such as Hive, Snowflake, Hudi, or other OLAP engines
Knowledge of SQL and solid experience with NoSQL or RDBMS technologies
Experience with data integrations and a variety of data formats: CSV, Protobuf, Parquet, Avro, ORC, etc.
Solid understanding of technical excellence and hands-on experience with code reviews, test development, and CI/CD
An understanding of what it means to build data processing systems at scale — if you love articles on highscalability.com, you’ll fit right in
Comfort in a Linux ecosystem
Owning the entire pipeline and keeping the costs down, the performance up, and the engineering process optimal
Developing platform features for customers to monitor and analyze their marketing campaigns
Scaling the distributed system and infrastructure to the next level
Optimizing algorithms and software architectures to save cloud hosting costs
Beeswax (www.beeswax.com/about) provides extremely high-scale Bidder-as-a-Service solutions in advertising technology, works with global businesses, and has raised $28M to date (including its most recent round, a $15M Series B).
Sigma Software partners with Beeswax to deliver key components of the platform and is looking for engineers to complement the Beeswax engineering team and drive further development of the platform.
The project is about building the next generation of real-time bidding software that enables sophisticated marketers to break free from the limitations and constraints of opaque, one-size-fits-all programmatic buying platforms.
The streaming applications are written in Java, use Kinesis, and handle a variety of scaling and configuration challenges. Batch ETL is managed by Airflow DAGs. An excellent working knowledge of SQL is critical, as a number of the ETL steps run in Snowflake, where a poorly written query can have a significant performance and cost impact.
As a Lead Data Platform Engineer, you will take direct ownership of and provide technical supervision to globally distributed engineering teams, including engineers in Ukraine and the U.S.