Requirements
— Bachelor's degree in Computer Science, a similar technical field of study, or equivalent practical experience.
— Commercial experience developing Spark jobs using Scala
— Commercial experience using Java and Scala (Python a nice-to-have)
— Experience in data processing using traditional and distributed systems (Hadoop, Spark, AWS S3)
— Experience designing data models and data warehouses.
— Experience with SQL and NoSQL database management systems (PostgreSQL and Cassandra)
— Commercial experience using messaging technologies (RabbitMQ, Kafka)
— Experience using configuration management and orchestration software (Chef, Puppet, Ansible, Salt)
— Confident building complex ETL workflows (Luigi, Airflow)
— Good working knowledge of cloud technologies (AWS)
— Good knowledge of monitoring software (ELK stack)
Technologies They Use
— Frameworks: Dropwizard, React, Akka, and Play Framework (Scala)
— Databases: PostgreSQL, AWS S3, Redshift, Redis, MongoDB, Cassandra
— Technologies: RabbitMQ (messaging), Quartz (scheduling), Docker and Kubernetes, Maven
— CI/CD: TeamCity, Jenkins
— Source Control: Git (GitHub)
— Other Tools: IntelliJ IDEA, Jira, Grafana
Perks & Benefits
— Startup engineering culture
— Good work/life integration (flexible working)
— Untracked annual leave
— Stock options from day one
— Individual coaching program
— Free trainers when you join their team
— Social activities to join in
Responsibilities
— Build services, features, and libraries that serve as definitive examples for new engineers, and make major contributions to library code or core services
— Design low-risk Spark processes and write effective, complex Spark jobs (data processing, aggregations, pipelines)
— Design low-risk APIs and write complex, asynchronous, highly parallel, low-latency APIs and processes
— Work as part of a team to maintain, improve, and monitor data collection processes using Java and Scala
— Write high-quality, extensible, and testable code by applying good engineering practices (TDD, SOLID)
— Understand and apply modern technologies, data structures, and design patterns to solve real problems efficiently
— Understand data architecture, apply appropriate design patterns, and design complex database tables
— Support the TA and Data Science teams to help deliver and productionise their backlog/prototypes
— Take ownership and pride in the products they build and always make sure they are of the highest standard
— Be empathetic towards team members and customers
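The Spark responsibilities above centre on aggregation pipelines. As a purely illustrative sketch (the `Event` type, its fields, and the `bytesPerUser` function are hypothetical, and plain Scala collections stand in for Spark's RDD/Dataset API), the core of such an aggregation might look like:

```scala
// Hypothetical event type: illustrative only, not taken from the posting.
case class Event(userId: String, bytes: Long)

object AggregationSketch {
  // Sum bytes per user: the shape of a typical Spark
  // groupBy/reduceByKey aggregation, here on plain collections.
  def bytesPerUser(events: Seq[Event]): Map[String, Long] =
    events
      .groupBy(_.userId)
      .map { case (user, evs) => user -> evs.map(_.bytes).sum }

  def main(args: Array[String]): Unit = {
    val sample = Seq(Event("a", 10L), Event("b", 5L), Event("a", 7L))
    println(bytesPerUser(sample)("a")) // 17
  }
}
```

In a real Spark job the same key-then-reduce shape would run distributed (e.g. `reduceByKey` on an RDD or `groupBy(...).agg(...)` on a Dataset) rather than on in-memory collections.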
About Our Partner
Our partner is a market leader in developing complex ETL and machine learning solutions. With published authors and award-winning data scientists who contribute to some of the major machine learning and distributed data technologies, such as Apache Spark, they are a friendly, passionate group of engineers making a career out of building great software for their customers.
They deal with hundreds of millions of data points every day, generated by over two thousand data processes running through workflows, huge distributed computations in Spark, and streaming data arriving twenty-four hours a day at hundreds of events per second.
Their engineering culture is underpinned by sharing knowledge, coaching, and growing together. You will have the opportunity to explore and innovate with new technologies, mentor engineers, and lead technology initiatives. You will enjoy this role if you love writing code, learning cutting-edge technologies, solving problems, and winning as a team.
As a Data Engineer, you will be working across their entire stack, so a real passion for driving the product and technology forward is something they value. Your responsibilities will include helping shape a vision for the future architecture of this complex data system and adding innovative ideas that use the latest cutting-edge technology. You will work closely with the Web and Data Science teams to deliver user-centric solutions to their customers and become an expert in developing high-quality technical solutions.