Role Overview: 

We seek a top-tier Senior Cloud Engineer (AWS) to lead strategic cloud operations, drive innovation, and shape product direction. Only candidates with proven expertise in large-scale, high-spend AWS environments will be considered. 

Key Responsibilities: 

Cloud Operations & Architecture: 
  • Design, secure, and optimize cloud infrastructure handling $1M+/year spend across 100+ AWS accounts. 
  • Automate operations using Python, CloudFormation, and Systems Manager to enable proactive scaling and cost-triggered remediations. 
  • Solve complex failures in distributed systems using CloudWatch, X-Ray, and native tooling. 
  • Implement measurable cost optimizations (e.g., 35% savings through rightsizing and Spot adoption). 
  • Enforce least-privilege security, encryption, and compliance via automated audits. 
  • Maintain >99.9% uptime through rigorous incident management and chaos engineering. 
 
Research & Product Leadership: 
  • Lead research initiatives and POCs to explore new technologies and methodologies. 
  • Provide opinionated insights and recommendations to influence product strategy and roadmap. 
  • Collaborate with cross-functional teams to translate research findings into actionable product features. 
 
Generative AI & Emerging Technologies: 
  • Design and implement generative AI applications using Amazon Bedrock, leveraging foundation models from providers like Anthropic and AI21 Labs. 
  • Develop and manage agentic AI workflows utilizing Amazon Bedrock Agents, including custom orchestrators for complex task automation. 
  • Integrate Model Context Protocol (MCP) servers to enhance AI capabilities with domain-specific knowledge and tool access. 
  • Stay abreast of industry trends and emerging technologies to inform product development. 
 
Desired Skills:

Must have:

  • Must have worked at an AWS-certified MSP or Enterprise Cloud Center of Excellence (CCOE) serving multiple internal teams. 
  • Must be hands-on in supporting production-grade workloads across 100+ AWS accounts with >$100K/month cloud spend. 
  • Must have performed 5+ customer assessments, such as Foundational Technical Reviews (FTRs) or Well-Architected Framework Reviews (WAFRs), with documented optimization results. 
  • Must have built proactive/reactive automations for cost, security, and compliance. 
  • Must have used AWS-native FinOps/SecOps tools (Security Hub, Config, Cost Explorer) or third-party equivalents.  
  • Proven experience in AWS cost optimization and financial operations (FinOps). 
  • Strong proficiency in AWS services such as EC2, S3, RDS, Lambda, and Bedrock. 
  • Experience with AWS Cost Explorer, Budgets, and other cost management tools. 
  • Proficiency in scripting languages like Python or Bash for automation purposes. 
  • Strong analytical and problem-solving skills. 
  • Excellent communication and collaboration abilities. 
  • AWS certifications such as AWS Certified Solutions Architect or AWS Certified AI Practitioner (preferred). 
 
 
Preferred Skills: 

  • Experience with FinOps tools like CloudHealth, Apptio Cloudability, or similar. 
  • Knowledge of cloud governance and compliance standards. 
  • Familiarity with DevOps practices and CI/CD pipelines. 
  • Experience in leading research initiatives and providing strategic product insights. 

Experience:

  • 8+ years of experience in cloud operations, with a strong focus on AWS services.
  • Hands-on experience designing and operating production-grade infrastructure on AWS.
  • Practical experience writing and managing AWS CloudFormation templates.
  • Hands-on experience using Amazon CloudWatch for monitoring, alerting, and dashboard configuration.
  • Performed independent root cause analysis and troubleshooting in cloud production environments.
  • Managed EC2 lifecycle, patching, and configuration using AWS Systems Manager in production setups.
  • Ensured cloud security using IAM policies, encryption standards, and audit mechanisms in line with best practices.
  • Worked directly with product or customer teams to translate business needs into scalable technical solutions.
  • Led operational readiness and incident management in critical environments, backed by concrete examples.
  • Experience in designing or deploying generative AI applications using Amazon Bedrock (preferred).
  • Experience working with Model Context Protocol (MCP) servers or similar AI orchestration frameworks (preferred).

Education:

  • Bachelor's or Master's degree in Computer Science, Engineering, or a related field.