Description:
The Principal I of DevOps Engineering acts as a technical expert, providing guidance and support on solutions that require deep technical experience in public cloud (GCP), Kubernetes, and Terraform.
Detailed Responsibilities/Duties:
- Manage and administer the Google Cloud environment, including provisioning, configuration, performance monitoring, policy governance, and security
- Design, develop, and implement highly available, multi-region solutions within Google Cloud
- Analyze existing operational standards, processes, and/or governance to identify and implement improvements
- Migrate existing infrastructure services to cloud-based solutions
- Develop, maintain, and optimize infrastructure as code (IaC) using Terraform and Terraform Cloud to provision and manage cloud resources and to ensure automated, consistent platform deployments
- Create and manage production-scale Kubernetes clusters (GKE) and implement best practices.
- Support our Kubernetes-based projects to resolve critical and complex technical issues.
- Design, build, and run elastic, cost-effective, resilient, robust, and secure architectures in the cloud using modern approaches like service mesh and loosely coupled design
- Maintain, configure, and monitor containers using Infrastructure as Code principles in Development, Test, and Live environments
- Implement Continuous Integration, Delivery, and Deployment using CI/CD tools such as Cloud Build, Cloud Deploy, Argo CD, and Azure DevOps.
- Deploy and configure Kubernetes clusters, especially GKE, using best practices.
- Monitor cluster health and performance.
- Troubleshoot and resolve cluster issues.
- Implement security best practices, including RBAC (Role-Based Access Control).
- Manage SSL certificates and encryption settings.
- Monitor and optimize resource usage within the clusters.
- Scale clusters to meet application requirements.
Skills Required:
- GCP experience involving design, deployment, configuration, and optimization
- Expertise in IaC development with Terraform
- Experience with IaaS and PaaS solutions
- Proficient with Git for source code management
- Experience with Terraform Cloud and CI/CD tooling
- Experience with Kubernetes administration (GKE)
- Experience with microservices architecture
- Knowledge of Linux
- Understanding of observability: which key metrics to monitor and how to establish a monitoring foundation.
- Proficient in a wide variety of cloud-native services, and able to advise others on them to some extent.
Certificates / Training:
• Certified Kubernetes Administrator (CKA) and/or Certified Kubernetes Application Developer (CKAD) is a plus.
• Azure/GCP/HashiCorp Certification is a plus.
Experience Level I:
• 6+ years of experience in DevOps/SRE with deep expertise in one area
• Experience with running Kubernetes clusters in production
• Essential experience working with GCP
Education Required:
• Bachelor's degree in Computer Science or related field.