Excellent knowledge of the general principles of Big Data
Experience with distributed systems
Excellent knowledge of Spark (Core, SQL, Streaming)
Excellent knowledge of Java, Scala, or Python
Experience with AWS or other cloud providers
Experience with Big Data solutions
Good spoken English
Knowledge of Kafka
Knowledge of Apache Airflow
Infrastructure management (DevOps)
Knowledge of PySpark
Knowledge of Databricks
Experience with Kubernetes
• Professional Development:
— Experienced colleagues who are ready to share knowledge;
— The ability to switch projects and technology stacks and to try out different roles;
— More than 150 workplace training courses;
— Study and practice of English: courses and communication with colleagues and clients from different countries;
— Support for speakers who present at conferences and technology community meetups.
• The ability to focus on your work: a lack of bureaucracy and micromanagement, and convenient corporate services;
• No dress code, a friendly atmosphere, and concern for the comfort of specialists;
• Flexible schedule and the ability to work remotely;
• The ability to work in any of our development centers.
Our client is a New York City-based FinTech startup building a trading and risk software platform for the investment banking industry, delivered as a service.
We’re looking for a Big Data engineer to build new parts of a scalable Data Lake and expand the existing ones using cutting-edge technologies (Spark, Hadoop, Databricks). The Data Lake is used by quantitative analysts for ad hoc queries and by data scientists to build models that predict investment opportunities. Data enters the Data Lake through ETL ingestion pipelines implemented with Apache Airflow. The specialist will join a distributed team on DataArt’s side consisting of 30 people from Ukraine, Russia, Poland, and the US.
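To give a flavor of the work, here is a minimal, self-contained sketch of the extract–transform–load pattern that such Airflow pipelines orchestrate. All names and the sample trade records below are hypothetical illustrations, not part of the client's actual system; in production, each step would be an Airflow task and the sink would be the Data Lake rather than an in-memory list.

```python
import json

# Hypothetical raw trade records, as they might arrive from an upstream feed.
RAW_RECORDS = [
    '{"symbol": "AAPL", "price": "189.50", "qty": "100"}',
    '{"symbol": "MSFT", "price": "412.10", "qty": "50"}',
]

def extract(lines):
    """Extract step: parse raw JSON lines into dicts."""
    return [json.loads(line) for line in lines]

def transform(records):
    """Transform step: cast numeric fields and derive a notional value."""
    out = []
    for r in records:
        price, qty = float(r["price"]), int(r["qty"])
        out.append({"symbol": r["symbol"], "price": price,
                    "qty": qty, "notional": price * qty})
    return out

def load(records, sink):
    """Load step: append cleaned records to a sink standing in for the lake."""
    sink.extend(records)
    return len(records)

lake = []
loaded = load(transform(extract(RAW_RECORDS)), lake)
```

In an Airflow DAG, each of these functions would typically become its own task so that failures can be retried per step and runs are visible in the scheduler UI.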