• DATAHUB solutions/patterns, including ODL (Operational Data Lake), ODS (Operational Data Store), and ADL (Analytic Data Lake).
• Cloud platforms generally, with proficiency in building Cloud-Native solutions and tools to design BigData PaaS/IaaS.
• BigData Ingestion technologies in general, and those associated with Cloud technologies in particular.
• BigData Processing/Transformation technologies in general, and those associated with Cloud technologies in particular.
• BigData Storage technologies in general, and those associated with Cloud technologies in particular.
• BigData Access technologies in general, and those associated with Cloud technologies in particular.
• BigData technologies in general, and those associated with Cloud technologies in particular.
• Data communication. You can turn complex data into clear, simple, and actionable stories. You will share data communication skills with the team and across the government.
• Data innovation. You can identify areas of innovation in data tools and techniques and recognize appropriate timing for adoption.
• Data modeling. You understand the concepts and principles of data modeling and can produce relevant data models across multiple subject areas. You know how to reverse-engineer data models from a live system. You understand industry-recognized data modeling patterns and standards and know when to apply them. You can compare and align different data models.
We are looking for a highly motivated and results-oriented senior data engineer/solution architect to join our team in developing a real-time data platform.
We want to develop ML- and AI-based event triggers that identify and deliver the next best product and/or service for our customers.
The team will be responsible for platform solution design and development, with an ambitious plan to grow and to onboard new teams and business hypotheses onto this platform.
Here is the Draft Vision of our Future Data Platform:
• The Data Platform is divided into the following distinct parts:
o Core Data Platform
o Onboarded Data Producer/Consumer Applications, Configuration & Customisations
• Onboarding & Management of Data Producer/Consumer Applications
• Application, Data Source, Job, Job Data, Job Steps, and Data Access Taxonomies
• Standardized Push/Pull Streaming Ingestion Interfaces
• Standardized Push/Pull Batch Ingestion Interfaces
• Polyglot Data-lake Storage (e.g., Bucket, KV, NoSQL, Search, Warehouse, SQL, etc.)
• Customizable Ingestion Conformance-tier Processors
• Customizable Ingestion Stage Processors
• Customizable Ingestion Archive Processors
• Customizable Ingestion Optimised Format (e.g., Parquet/ORC) Processors
• Customizable Transformation/Enrichment Processors
• Customizable Microservices & API Processors
• Standardized Query API (e.g., JDBC/ODBC) Data Access Interfaces
• Standardized Export API, Bucket Endpoint & Notify Data Access Interfaces
• Standardized Streaming Egress Data Access Interface
• RBAC (Role-Based Access Controls)
• Secrets/Vault capabilities
• TLS (in-flight) & TDE (at-rest) Encryption capabilities
• Operational Metrics & Monitoring capabilities
• Data Platform Resource Utilisation Tracking & Cost Allocation capabilities
• DLM (Data Lifecycle Management), Archive, and Retention capabilities
• Schema Registry capabilities
• Login & Data Platform Portal Home UI
• Analytics Notebook UI (e.g., Jupyter)
• Data Catalog UI
• VDI Workbench Portal UI
• Job Management UI
• Administration UI