Cloud DevOps Automation
Petaling Jaya, MY
Role Summary
The Cloud DevOps Automation Engineer is responsible for building and operating automation-first cloud platforms across AWS environments. This role unifies cloud infrastructure, DevOps practices, and scaffolding frameworks through robust automation to enable scalable, secure, and efficient service delivery.
Acting as the bridge between infrastructure, platform engineering, and developer experience, the engineer will design systems where automation is the primary control plane, powering infrastructure provisioning, EKS operations, CI/CD pipelines, and self-service service onboarding.
The role emphasizes eliminating manual operations, standardizing workflows, and enabling repeatable, production-grade deployments using tools such as CloudFormation, Terraform, AWS Account Factory for Terraform (AFT), Customizations for AWS Control Tower (CFCT), Kubernetes (EKS), and Backstage.
Accountabilities
- Design and implement end-to-end automation frameworks for AWS infrastructure, EKS platforms, and application delivery.
- Own and operate AWS multi-account environments using AFT and CFCT, driven by automation.
- Build and maintain automated CI/CD pipelines and GitOps workflows.
- Develop reusable scaffolds and templates that are fully automation-enabled.
- Ensure all infrastructure, deployments, and operations are:
  - Automated
  - Repeatable
  - Version-controlled and auditable
- Drive operational excellence through automation of monitoring, recovery, and scaling.
- Enable self-service capabilities via Backstage and automation pipelines.
- Reduce manual intervention and improve deployment speed, consistency, and reliability.
Responsibilities
1. Cloud Infrastructure Automation
- Design and automate AWS infrastructure provisioning:
  - VPCs, EC2 instances, EKS, subnets, routing, and security groups
- Implement automation-driven multi-account management using:
  - AWS Control Tower, AFT, and CFCT
- Build and maintain:
  - AFT pipelines for account provisioning and lifecycle automation
  - CFCT pipelines for automated baseline configuration and governance
- Automate:
  - Account bootstrapping (IAM, logging, security baselines)
  - Environment provisioning across dev, staging, and production
- Ensure infrastructure is deployed and managed exclusively via IaC and automation pipelines
2. Kubernetes & EKS Automation
- Automate lifecycle management of Amazon EKS clusters
- Implement automated deployment patterns using:
  - Helm and GitOps (ArgoCD/Flux)
- Automate Kubernetes operations:
  - Workload deployments
  - Scaling policies
  - Configuration and secrets management
- Build automated workflows for:
  - Cluster upgrades and patching
  - Performance tuning and optimization
  - Failure recovery and resilience
3. Infrastructure as Code & Automation Frameworks
- Develop and maintain IaC-driven automation frameworks using:
  - Terraform and AWS CloudFormation
- Create reusable automation modules:
  - Terraform modules for infrastructure and AFT customization
  - CFCT configuration packages for governance enforcement
- Integrate IaC into CI/CD pipelines for fully automated provisioning
- Enforce:
  - Version control
  - Auditability
  - Security and compliance through automation
4. DevOps Automation & CI/CD
- Design and implement fully automated CI/CD pipelines using Bitbucket Pipelines
- Automate:
  - Build, test, security scanning, and deployment workflows
- Containerize applications using Docker and manage images in Amazon ECR
- Integrate:
  - SAST, DAST, and container security scanning
  - IaC validation and policy enforcement
- Enable zero-touch or low-touch deployments for engineering teams
5. Scaffolding & Self-Service Enablement (Secondary Focus)
- Develop reusable, automation-driven scaffolds:
  - Infrastructure templates (Terraform/CloudFormation)
  - Kubernetes deployment templates (Helm)
  - Application boilerplates
- Integrate scaffolds into Backstage Software Templates
- Ensure scaffolds:
  - Automatically provision infrastructure
  - Configure pipelines
  - Deploy services with minimal manual steps
- Align scaffolds with:
  - AFT-provisioned accounts
  - CFCT-enforced baselines
- Promote standardized golden paths powered by automation
6. Observability, Security & Compliance Automation
- Automate observability:
  - Logging, metrics, and alerting (CloudWatch, Splunk, Prometheus)
- Embed security via automation:
  - IAM provisioning and least privilege enforcement
  - IAM Roles for Service Accounts (IRSA) for Kubernetes workloads
  - Secrets management (AWS Secrets Manager, SSM)
- Implement policy-as-code and automated compliance checks
- Ensure all environments are continuously monitored and compliant by default
7. Cloud Operations & Reliability Engineering
- Operate AWS environments with a focus on automation-driven operations
- Automate:
  - Incident detection and alerting
  - Recovery workflows where possible
- Perform advanced troubleshooting:
  - Networking (BGP, routing, hybrid connectivity)
  - Kubernetes and application-level issues
- Participate in:
  - On-call rotations
  - Incident response and root cause analysis
- Continuously improve reliability by eliminating repetitive manual tasks through automation
Key Skills & Experience
- Strong experience with AWS (EKS, IAM, VPC, Control Tower)
- Proven expertise in:
  - AWS Account Factory for Terraform (AFT)
  - Customizations for AWS Control Tower (CFCT)
- Deep hands-on experience with Kubernetes (EKS and Helm)
- Advanced proficiency in Terraform and CloudFormation
- Strong experience in building automation frameworks and CI/CD pipelines
- Knowledge of AWS networking and hybrid connectivity (Direct Connect, VPN, BGP)
- Experience with Docker and Amazon ECR
- Solid Linux system administration and troubleshooting skills
- Familiarity with observability tools (CloudWatch, Splunk, Prometheus, Grafana)
- Experience with Backstage or developer platforms (preferred)
- Strong automation-first engineering mindset with a focus on operational efficiency