October 20, 2021

Data Engineer (Python)

Kyiv

Our partner’s Core AI team is in charge of building models for the next generation of AI-powered TV products. They are responsible for the end-to-end development of their models, including:

  • dataset collection using geographically distributed television labs;
  • model training in the cloud using serverless GPU clusters;
  • model optimization for constrained computation on the edge;
  • model testing using both virtual and real televisions; and
  • creation of the data pipelines and tooling that make the above possible.

As a member of their team, you will be working at the intersection of engineering, science, and entertainment. As a Data Engineer, you make it all possible. You enable the team by building robust data pipelines and tooling for accelerating the Model Development Life Cycle.

Responsibilities

  • Design, build, and maintain scalable, robust, and secure data pipelines (for training/test video from television labs, model artifacts, model evaluation results and summaries, etc.)
  • Promote great software engineering practices, help improve existing processes, and establish new ones
  • Enable Machine Learning engineers to succeed in the end-to-end model development process by designing tools and processes that simplify working with labeled data, features, models, and relevant metrics

Requirements

  • Bachelor’s degree in Computer Science, Engineering, or a related field
  • 4+ years of professional development experience building high-performance, large-scale applications/pipelines with solid experience in Python
  • Solid foundation in computer science, with strong competencies in data structures, algorithms and software design
  • Experience with at least one distributed computation framework (Spark, Hive, Dask, Metaflow, etc.)
  • Experience with at least one job orchestration framework (Airflow, Luigi, etc.)
  • Strong command of Linux and version control systems
  • Strong verbal and written communication skills

Preferred Qualifications

  • Experience with components of modern Machine Learning architectures—feature stores, model stores, evaluation stores, etc.
  • Familiarity with cloud providers and serverless architectures (Amazon Web Services, Google Cloud, etc.)
  • Familiarity with container orchestration tools (Kubernetes, ECS, etc.) in a production setting
  • Experience working with video—video capture, video processing, transcoding, frame analysis, ffmpeg