7+ years of experience as a software engineer
At least 3 years of experience with Java/Scala and Spark
Must: Deep understanding of Spark’s internals: query planning, the Filesystem API, code generation, etc.
Nice to have: experience with PySpark, Hive Metastore, Delta Lake, Apache Hudi, or Apache Iceberg
Develop high-end solutions to the technical challenges we face in fulfilling our mission, using your deep knowledge of Spark and the Hadoop ecosystem.
Maintain the product in production to ensure the best possible service to our customers
Build a product that fits the market by helping analyze feedback from the community and our design partners
Work as part of a trusting team and help others on the team feel safe and trusted.
Our project is an open-source platform that brings resilience and manageability to object-storage-based data lakes. It works by extending existing object stores (e.g., S3) with advanced Git-like capabilities.
By supercharging the storage layer with ACID-like guarantees based on atomic branching, committing, and merging of changes, we free developers from the risks and complexities of managing data at scale:
changes can be reverted instantly, writers have full control over when their data becomes visible to consumers, and by applying CI rules we can enforce standards and avoid costly mistakes.
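The Git-like model described above can be sketched in a few lines. The following toy Python model is purely illustrative (the `Repo`/`Commit` names are hypothetical and are not the project's actual API): commits are immutable snapshots of the key-to-object mapping, branches are mutable pointers to commits, and merging or reverting is just an atomic pointer update — which is why isolation and instant rollback come essentially for free.

```python
# Illustrative sketch only: Git-like semantics over an immutable object store.
# Commits are immutable snapshots; branches are pointers to commits;
# merge and revert are atomic pointer moves, so no data is copied.

class Commit:
    def __init__(self, snapshot, parent=None, message=""):
        self.snapshot = dict(snapshot)  # frozen key -> object-version mapping
        self.parent = parent
        self.message = message

class Repo:
    def __init__(self):
        self.branches = {"main": Commit({}, message="initial")}

    def branch(self, name, source="main"):
        # Creating a branch copies only a pointer, not the underlying data.
        self.branches[name] = self.branches[source]

    def commit(self, branch, changes, message=""):
        head = self.branches[branch]
        self.branches[branch] = Commit({**head.snapshot, **changes},
                                       parent=head, message=message)

    def merge(self, source, dest):
        # Fast-forward-style merge: atomically advance the dest pointer.
        self.branches[dest] = self.branches[source]

    def revert(self, branch):
        # Reverting is instant: move the pointer back to the parent commit.
        head = self.branches[branch]
        if head.parent is not None:
            self.branches[branch] = head.parent

repo = Repo()
repo.branch("ingest")                                   # isolate new writes
repo.commit("ingest", {"events/2024.parquet": "v1"}, "load events")
# Readers of "main" see nothing until the writer chooses to merge:
assert "events/2024.parquet" not in repo.branches["main"].snapshot
repo.merge("ingest", "main")                            # becomes visible atomically
assert repo.branches["main"].snapshot["events/2024.parquet"] == "v1"
```

In a real system the snapshots would be metadata over objects in S3 and the pointer updates would need to be atomic compare-and-swap operations, but the shape of the model is the same.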
We believe in creating a safe and trusting environment where we are all accountable for our work, as a team and as individuals. We share context so that each of us can make decisions and help others make theirs.
We work for what we believe in, and we solve problems in a blame-free environment.