As a Data Engineer, you will handle challenging data at massive scale, build our data pipelines, and lead every aspect of our ETL processes.
The Data Science team will also need your help making their models production-ready. No rest for the wicked, though: after all of that is done, you’ll transform the results into actionable items for our clients and other internal stakeholders. If you’ve read this far and you’re not scared off, you really should apply.
You will work with a plethora of storage and processing technologies on top of cutting-edge application frameworks and infrastructure, bridging the gap between Data Science and Engineering.
— Excellent proficiency in SQL (at least 5 years): complex queries, SQL scripting, performance tuning.
— Background working with one of these MPP databases (in order of preference): Snowflake, Redshift, BigQuery, Vertica.
— Excellent Python development capabilities (at least 5 years): working with DataFrames, data processing, data management.
— Experience working with Spark SQL and RDDs is a big advantage.
— Developing data pipelines and ETL processes with Airflow (a must).
— Infrastructure: at least 3 years of experience with AWS data services: S3, EMR, Athena, Redshift, EC2, RDS, DMS, Kinesis, Lambda, SQS.
— A techie who is passionate about expanding their knowledge of data technologies and always willing to learn more: NoSQL, Hadoop, big data, Kubernetes, Docker, and more.
— Experience with reporting tools such as Tableau, Sisense, or Looker.
— Team player.
— Design and development of new data pipelines in Airflow
— Integrate new data sources and apply various transformations to the incoming data
— Support existing functions in the data engineering area
— Improve the automation of data usage across the company
— Take part in designing and building the company’s data platform in the cloud
— Maintain technical documentation for the data platform