• Expertise in Java
• Knowledge of Scala to be able to work with Spark
• Data engineering skills and experience with high volumes of data are a must
• Experience working with AWS or GCP cloud platforms (preferable)
• Experience working with GCP Dataproc (Apache Spark) and GCP Dataflow (Apache Beam)
• Good understanding of GCP BigQuery and GCP Cloud Storage
• Knowledge of Jenkins and Airflow, with Groovy and Python respectively
• Familiarity with design patterns, clean code, and unit testing
• Experience working in an Agile environment
• Data modelling skills would be a plus
• Excellent written and verbal communication skills
• Intermediate English level, both verbal and written (B1)
• Competitive compensation depending on experience and skills
• Individual career path in engineering
• Social package — medical insurance, sports
• Unlimited access to LinkedIn learning solutions
• Sick leave and regular vacation
• Partial coverage of costs for certification and IT conferences
• English classes with certified English teachers
• Flexible work hours
• Possibility to work on the full product lifecycle, from concept to delivery into production
• Mentorship program
• Guaranteed professional growth through technology training and technology communities inside EPAM
• Working in a team of proactive Agile/Scrum/XP practitioners
The project aims to create a data lake for one of the biggest data analytics companies, which works with personal information both domestically and internationally. In a nutshell, this involves replatforming an on-premises Enterprise Data Hub from a Hadoop cluster to GCP. Day-to-day tasks include, but are not limited to, creating Spark applications that manipulate data from different sources, including Oracle, Google Cloud Storage, and BigQuery; creating pipelines via GCP Dataflow; and working with Jenkins and Airflow.