Exadel is an international IT company headquartered in the USA. We design software solutions, deliver digital platforms, and have been creating unique products for Fortune 500 customers for 25+ years. With 30+ offices across the US, Europe, Caucasus, and Asia, Exadel addresses the most complex engineering problems with innovative solutions.
April 16, 2021

Senior Data Engineer (position inactive)

Kyiv, Kharkiv, Lviv, Odesa, Vinnytsia

Required Skills

3+ years’ experience building, maintaining, and supporting complex data flows with structured and unstructured data
Proficiency in Python and PySpark
Hands-on experience with HDFS, Hive, and Sqoop
Experience building data pipelines/microservices with Apache Kafka
Experience in Apache Airflow to orchestrate and schedule complex data flows
Ability to use SQL for data profiling and data validation
Experience in Unix commands and scripting
Practical knowledge of AWS components such as EMR and S3
Master’s or Bachelor’s degree

Preferred Qualifications:
Experience with and understanding of Continuous Integration and Continuous Delivery (CI/CD)
Understanding of performance tuning in a distributed computing environment (such as a Hadoop cluster or EMR)
Familiarity with BI tools (such as Tableau and MicroStrategy) and high comfort level using data modeling techniques

We Offer

You’ll build your expertise with Sales Support, which provides assistance with existing and potential projects
You can join any Exadel community or create your own to communicate with like-minded colleagues
There are opportunities for continuing education as a mentor or speaker
You can take part in internal and external meetups as a speaker or listener
You’ll have the chance to improve your English skills with the help of native speakers
We participate in cultural, sport, charity, and entertainment events, and we’d love to have you there, too!

Responsibilities

Build end-to-end data flows from sources to fully curated and enhanced data sets. This can include the effort to locate and analyze source data, create data flows to extract, profile, and store ingested data, define and build data cleansing and imputation, map to a common data model, transform to satisfy business rules and statistical computations, and validate data content
Modify, maintain, and support existing data pipelines to provide business continuity and fulfill product enhancement requests
Provide technical expertise to diagnose errors from production support teams
Coordinate with on-site teams and work seamlessly with the US team
An ideal candidate will develop and maintain exceptional SQL code bases and expand our capability through Python scripting

About the Project

The Enterprise Analytics team at the customer’s company has an open position for a Senior Data Engineer. The team builds platforms to provide insights to internal and external clients of the customer’s businesses in auto property damage and repair, medical claims, and telematics data. The customer’s solutions include analytical applications for claim processing, workflow productivity, financial performance, client and consumer satisfaction, and industry benchmarks.

About the Customer:
The customer is a leading provider of vehicle lifecycle solutions, enabling the companies that build, insure, repair, and replace vehicles to power the next generation of transportation. The company delivers advanced mobile, artificial intelligence, and connected car technologies through its platform, connecting a vibrant network of 350+ insurance companies, 24,000+ repair facilities, OEMs, hundreds of parts suppliers, and dozens of third-party data and service providers. The customer’s collective set of solutions informs decision-making, enhances productivity, and helps clients deliver faster and better experiences for end consumers.

The customer’s company was ranked #17 in the Top 100 Digital Companies in Chicago in 2020 by Built in Chicago, an online community for digital technology entrepreneurs in Chicago, and was named one of Forbes’ best mid-sized companies to work for in 2019, an important accolade and retention tool for the 2,600+ full-time company employees (alongside 350 dedicated contractors).
The company’s corporate headquarters is in downtown Chicago in the historic Merchandise Mart, a LEED-certified (Leadership in Energy and Environmental Design) building that is also known as a technology hub within the broader metro area.

About the Project:
The customer has been working on an analytics platform since 2018. The platform runs on Hadoop and the Hortonworks Data Platform, and the customer plans to move it to Amazon EMR in 2021. Data from all of the customer’s products flows into a single data lake on this platform, which also enables next-generation analytics on the amassed data.

Architecture:
Hortonworks is the current vendor and will be replaced by Amazon EMR. Tableau will be the BI vendor; MicroStrategy is currently in use and will be phased out by early 2023.

All data is sent to the data lake, which lets the customer do industry reporting. A data science team uses this data to build new products and an AI model.

We will be moving to real-time streaming using Kafka and S3. We are running a proof of concept (POC) to evaluate Dremio and Presto as the query engine.

We’re migrating to version 2.0 using Amazon EMR and S3, and the query engine work falls under the 2.0 project.

Project Advantages:
Cross-product analytics
Analytics for every new product the customer has. The analytics team’s products are how the customer demonstrates product value to clients
Quarterly Business Review meetings use data to explain how the customer’s product is helping clients in their business
You’ll get to work with a cross-functional team
You will learn the customer’s company business

Project Tech Stack:
All technologies used are open source: Hadoop, Hive, PySpark, Airflow, and Kafka, to name a few

Project Stage:
Active Development
