We are looking for a Data Engineer who is motivated to build a robust, scalable data platform. In this role, you will own the company’s core data pipeline, which powers our top-line metrics, and apply your data expertise to evolve the data models across the data stack. You will architect, build, and launch highly scalable, reliable data pipelines that support the company’s growing data processing and analytics needs. Your work will give teams such as Analytics, Data Science, and Operations access to business and user behavior insights.
Requirements:
* Proficient in SQL, especially the PostgreSQL dialect.
* Expertise in Python for developing and maintaining data pipeline code.
* Experience with Apache Spark and the PySpark library (experience with the AWS extension of PySpark is a plus).
* Experience with BI software (preferably Metabase or Tableau).
* Experience with the Hadoop ecosystem (or similar).
* Experience with deploying and maintaining data infrastructure in the cloud (experience with AWS preferred).
* Comfortable working directly with the data analytics team to bridge business requirements and data engineering.
Responsibilities:
* Own the core company data pipeline and scale the data processing flow to keep pace with rapid data growth
* Continuously evolve the data model and data schema based on business and engineering needs
* Implement systems that track data quality and consistency
* Develop tools that support self-service data pipeline (ETL) management
* Tune SQL and MapReduce jobs to improve data processing performance
Our technical stack:
* JavaScript (with Flow) codebase: Node.js (Express), React, React Native
* GraphQL API (no REST)
* PostgreSQL DB (transactional)
* ClickHouse (columnar DB for data warehousing)
* Redis (session storage, task queue management)
* Sequelize ORM (exploring Prisma as a potential replacement)
* Jest test runner
* Infrastructure: Docker images orchestrated with Kubernetes and Ksonnet, currently all hosted on AWS