Principal Data Engineer
Petaling Jaya, MY
Role Mission:
To provide technical leadership within StarHub’s Digital Experience Platform (DXP) Data organization by designing, delivering, and operationalizing complex data pipelines, curated datasets, and reusable engineering patterns on the cloud-native data platform. This role drives technical excellence across data ingestion, transformation, modeling, DataOps, and production reliability to enable trusted, scalable, and self-service analytics across business domains.
Accountabilities:
- Own technical delivery of complex, high-impact data engineering initiatives across ingestion, transformation, modeling, and operational stabilization.
- Serve as the senior technical leader within the Data Engineering function, setting implementation direction, reviewing design quality, and uplifting engineering standards across the team.
- Drive production reliability, observability, and root-cause elimination for critical pipelines and datasets.
- Develop reusable engineering patterns, frameworks, and automation to improve delivery speed, quality, and maintainability.
- Partner with Data Architecture, Platform Engineering, Data Quality Stewards, BI, and business stakeholders to translate requirements into trusted and scalable data products.
- Coach and mentor engineers through design reviews, code reviews, troubleshooting, and day-to-day technical guidance without direct people management responsibility.
Responsibilities:
- Technical Delivery & Solution Design: Lead design and implementation of complex ingestion, transformation, and curated data model solutions across Datapipe, Snowflake, and AWS, ensuring scalable, reusable, and cost-efficient patterns.
- Engineering Standards & Quality: Establish and enforce practical engineering standards across SQL, Python, DAG design, CI/CD, testing, observability, RBAC-aware implementation, and cost-aware design.
- Operational Excellence: Own production stability for critical pipelines and datasets, including incident triage, recovery leadership, RCA, and preventative improvement actions.
- Reusable Enablement: Build reusable components, templates, runbooks, and agentic delivery patterns to reduce duplicated effort, improve maintainability, and raise engineering velocity.
- Data Quality & Trusted Data: Embed automated data quality controls into pipelines and curated layers, including validation, anomaly detection, reconciliation, and schema drift checks.
- Collaboration & Enablement: Work with architects, stewards, platform engineers, BI teams, and business stakeholders to shape requirements into implementable data contracts and trusted datasets for self-service analytics.
- Technical Leadership by Influence: Act as the senior technical escalation point for difficult engineering and production issues, while coaching Senior Data Engineers and Data Engineers through design and implementation guidance.
Team Scope / Stakeholders:
- Scope: Complex pipelines, curated datasets, reusable engineering patterns, and production reliability across the DXP Data Platform (C360, the Datapipe ingestion solution based on Airbyte and Apache Airflow, Snowflake, SageMaker, and related cloud-native services).
- Decision Rights: Technical design decisions within assigned initiatives, implementation patterns, code quality expectations, incident recovery actions, and recommendations on engineering prioritization and standards uplift.
- Stakeholders: Data Engineering, Platform Engineering, Architecture & Governance, BI, Data Science, Data Quality Stewards, Business Data Owners, Infrastructure, Cybersecurity/ISO, and Application domain teams.
- Resources: Individual contributor role operating as the senior-most hands-on engineer within the Data Engineering team, with responsibility to guide and uplift engineers across Singapore, Malaysia, and India through technical leadership.
Minimum Profile / Track Record:
- 7–10+ years of experience in cloud-native data engineering, with strong hands-on architecture, delivery, and production support experience on AWS and Snowflake.
- Strong track record delivering complex data engineering initiatives independently, with the ability to operate across both build and run responsibilities.
- Experience partnering with BI and business teams to design modelled datasets and enable self-service analytics.
- Demonstrated technical leadership through design reviews, code reviews, mentoring, and troubleshooting guidance without formal team management responsibility.
- Deep hands-on technical expertise, including:
  - Snowflake: schema design, Streams/Tasks, Stored Procedures, UDFs, RBAC-aware development, performance tuning, cost monitoring, Cortex AI, and Streamlit.
  - Airflow or similar data orchestration tools: DAG design, orchestration, scheduling, dependency management, retry patterns, and observability.
  - Python and SQL: pipeline scripting, transformation logic, data validation, and operational tooling.
  - ELT/ETL frameworks: Airbyte, Fivetran, and custom connector understanding or development.
  - AWS services: S3 (data lake structures and archival), Lambda, KMS, Transfer Family, CloudWatch, and SageMaker.
- Demonstrated success delivering medallion architecture (Bronze/Silver/Gold) and enabling self-service data use cases.
- Experience implementing automated data quality controls, remediation workflows, and data lineage-aware engineering practices across enterprise datasets.
- Familiarity with machine learning or AI integration using platforms like AWS SageMaker.
- Proven ability to troubleshoot complex data issues, lead root-cause analysis, and improve production stability through mechanisms rather than repeated manual intervention.
- Track record of raising team engineering quality through reusable patterns, operational discipline, and technical coaching.