Cloud Ops Architect with skills Cloud Service Incident Management, AWS - EKS, AWS - CloudFormation, Lambda Test, ITSM Principles, Cloud Service Manager, Infrastructure Support, AWS-Infra, Infrastructure Support L1, DevOps Engineering, CI/CD, AWS CloudTrail for location Pune, India
ROLES & RESPONSIBILITIES
Key Responsibilities
Leadership & Team Management
Lead a team of AWS cloud engineers and operations professionals, ensuring continuous improvement in technical expertise and customer satisfaction.
Develop and maintain a cloud management operational model that aligns with industry best practices and customer requirements.
Foster collaboration between engineering and operations teams to drive efficient, scalable, and secure cloud solutions.
Act as an escalation point for high-priority incidents and ensure timely resolution.
Operations Management
Own and manage the delivery of managed services, ensuring adherence to SLAs, KPIs, and customer expectations.
Design and implement monitoring, alerting, and incident management processes for multi-account, multi-region AWS environments.
Optimize cloud operational costs by implementing governance, resource optimization, and automation practices.
Oversee capacity planning, patch management, and proactive maintenance schedules.
Engineering & Innovation
Drive automation in operational workflows using Infrastructure as Code (IaC), CI/CD pipelines, and AIOps tools.
Collaborate with the Cloud Transformation team to implement scalable and future-ready AWS architectures.
Provide thought leadership in AWS technologies, keeping the team ahead of emerging trends and updates.
Standardize and document operational processes, ensuring compliance with industry and regulatory standards.
Customer Success & Stakeholder Management
Engage with customers to understand business requirements, pain points, and growth objectives.
Prepare regular operational and performance reports for internal and customer stakeholders.
Act as a trusted advisor for customers, recommending technical and operational improvements tailored to their needs.
Key Qualifications
Technical Expertise (Must Have)
Core AWS Services Expertise: Extensive experience with foundational AWS services including EC2, S3, VPC, RDS, Lambda, and CloudFront, with a focus on design, deployment, and optimization.
Networking and Security: Deep understanding of AWS networking components such as VPC, Transit Gateway, Route 53, Direct Connect, and VPN. Expertise in implementing security best practices, including IAM policies, security groups, AWS WAF, GuardDuty, and AWS Security Hub.
Monitoring and Logging: Hands-on experience with AWS-native tools like CloudWatch, CloudTrail, and third-party solutions such as Datadog or Splunk to ensure system reliability and performance monitoring.
Cost Management and Optimization: Skilled in using AWS Cost Explorer, Trusted Advisor, and AWS Budgets to monitor, manage, and optimize cloud costs effectively.
Storage and Data Management: Proficiency in AWS storage services such as EBS, S3, and Glacier, as well as database solutions like Aurora, DynamoDB, and Redshift for diverse data needs.
Technical Expertise (Nice to Have)
Infrastructure as Code (IaC): Proficient in using Terraform, AWS CloudFormation, or CDK to automate infrastructure deployment, ensuring consistency and scalability.
Operational Excellence: Experience in implementing and managing DevOps pipelines with AWS CodePipeline, CodeBuild, and third-party CI/CD tools.
Containers and Serverless: Knowledge of deploying and managing containerized applications using ECS, EKS, or Fargate, and leveraging serverless architectures with Lambda and API Gateway.
High Availability and Disaster Recovery: Proven ability to architect highly available, fault-tolerant systems using AWS services such as Auto Scaling Groups, Elastic Load Balancers, Multi-AZ deployments, and Route 53 for failover.
Professional Experience
AWS Cloud Operations: A minimum of 10 years of hands-on experience managing AWS cloud infrastructure, with a proven track record in running production environments for enterprise or mid-sized organizations.
Technical Leadership: At least 3 years in a leadership role overseeing cross-functional teams, managing day-to-day operations, and driving technical projects to completion.
Managed Services Delivery: Demonstrated experience providing managed cloud operations to external customers, including SLA management, incident resolution, and delivering operational excellence.
Incident and Problem Management: Expertise in managing high-priority incidents, root cause analysis, and implementing permanent solutions to prevent recurrence.
Cloud Migration and Modernization: Experience leading cloud migration projects, including re-hosting, re-platforming, and refactoring applications for the AWS cloud.
Operational Automation: Hands-on experience implementing automation in workflows, monitoring, and operational tasks using AWS-native tools, scripts, or third-party platforms.
Cost Optimization: Experience in managing and reducing cloud infrastructure costs while balancing performance and scalability.
Leadership Experience
Proven ability to lead and manage cross-functional technical teams.
Experience with managed operations, incident management, and customer engagement.
Excellent problem-solving skills and the ability to make decisions under pressure.
Preferred Qualifications
AWS certifications (e.g., AWS Certified Solutions Architect, AWS Certified DevOps Engineer, AWS Certified SysOps Administrator).
Familiarity with multi-cloud environments and integration scenarios.
Knowledge of AIOps tools and practices.
EXPERIENCE
- 12-14 Years
SKILLS
- Primary Skill: Cloud Service Incident Management
- Sub Skill(s): Cloud Service Incident Management
- Additional Skill(s): AWS - EKS, AWS - CloudFormation, Lambda Test, ITSM Principles, Cloud Service Manager, Infrastructure Support, AWS-Infra, Infrastructure Support L1, DevOps Engineering, CI/CD, AWS CloudTrail
ABOUT THE COMPANY
Infogain is a human-centered digital platform and software engineering company based in Silicon Valley. We engineer business outcomes for Fortune 500 companies and digital natives in the technology, healthcare, insurance, travel, telecom, and retail & CPG industries using technologies such as cloud, microservices, automation, IoT, and artificial intelligence. We accelerate experience-led transformation in the delivery of digital platforms. Infogain is also a Microsoft (NASDAQ: MSFT) Gold Partner and Azure Expert Managed Services Provider (MSP).
Infogain, an Apax Funds portfolio company, has offices in California, Washington, Texas, the UK, the UAE, and Singapore, with delivery centers in Seattle, Houston, Austin, Kraków, Noida, Gurgaon, Mumbai, Pune, and Bengaluru.