- Knowledge of big data tools (any one of the following or similar tools — HDFS, Hive, Sqoop, Zookeeper, Spark, MapReduce2, YARN, Tez, Kafka, Airflow, Dremio, Presto)
- Experience with cloud hosting (preferably AWS)
- Experience with Kubernetes or Docker
- Experience monitoring cloud-based systems
- Install, maintain, upgrade, troubleshoot, and support cloud-based Hadoop clusters and related big data components
- Ensure cloud-based Hadoop clusters are implemented to meet the Customer’s standards and service level requirements
- Provide performance tuning for AWS EMR Hadoop cluster environment
- Document the Customer’s cloud-based Hadoop clusters
- Interact with the BI team, developers, and architects to ensure big data applications are highly available and performant
- Participate in time-sensitive technical incident calls to restore service when Hadoop environment problems occur in production
- Work in a highly collaborative environment
- Ability to quickly understand both business and technical concepts, innovate, prioritize and multitask
- Self-motivated and self-managed with a high degree of analytical and intellectual curiosity
- Ability to be a strong team member and communicate effectively
- 20 working days of vacation and up to 20 working days of sick leave per year
- Full payment of taxes
- English courses
- Flexible work schedule
- Friendly environment
- Medical insurance
About the Customer:
The customer is an American company based in Chicago with more than 40 years of experience in the P&C insurance industry. The customer moved to the cloud in 2003, and it began using its first ML algorithms as early as 10 years ago. The company accelerates digital transformation for the insurance and automotive industries with AI, IoT, and workflow solutions.
The core product is a comprehensive SaaS platform that consolidates about 30,000 stakeholders, namely insurance companies, repair facilities, auto manufacturers, lenders, fleets, and everyone involved in resolving critical moments following an accident.
About the Project:
The customer has been working on an analytics platform since 2018. The platform runs on Hadoop and the Hortonworks Data Platform, and the customer plans to move it to Amazon EMR in 2021. The customer has a variety of products, and the data from all of them flows into a single data lake on this analytics platform, which also enables next-generation analytics on the amassed data.
Project Tech Stack:
The technologies used are all open source: Hadoop, Hive, PySpark, Airflow, and Kafka, to name a few.
The team consists of 3 Big Data platform administrators based in the US (Central Time). An onsite Senior Big Data administrator will mentor the newcomer and bring them up to speed. The team works closely with many other product teams to support them.
Daily stand-ups at 9:30 CT. Bi-weekly 1:1 meetings with the team lead.
- Working with and learning from experienced onsite big data platform administrators
- Working on analytics for every new product the customer has — the analytics team's products are how the customer demonstrates product value to clients
- Seeing the impact of your work — Quarterly Business Review meetings use data to show how the customer's products help clients in their business