Job Description
We are seeking a Senior DevOps / Cloud Operations & Security Monitoring Engineer with 10+ years of IT experience and a strong background in cloud infrastructure, operations, and security monitoring. This role is responsible for the reliability, performance, security, and cost-efficiency of AWS cloud environments, and for advancing automation, observability, and incident response.
The ideal candidate will bring deep hands-on experience in AWS operations, cloud security monitoring, infrastructure-as-code, and automation practices.
Key Responsibilities
Cloud Operations
- Administer, configure, and support AWS infrastructure (EC2, S3, VPC, IAM, RDS, Lambda, etc.).
- Monitor system health, performance, and availability across cloud workloads.
- Implement and manage Infrastructure-as-Code (Terraform, CloudFormation).
- Troubleshoot outages, service disruptions, and performance issues.
- Optimize cloud resource utilization and manage cost-efficiency initiatives.
- Maintain backup strategies, disaster recovery plans, and high-availability architecture.
- Support CI/CD pipelines and automated deployment workflows.
Security Monitoring & Incident Response
- Monitor AWS security logs, telemetry, and alerts (CloudTrail, GuardDuty, Security Hub, etc.).
- Investigate security incidents and coordinate remediation efforts.
- Maintain and tune detection rules, monitoring dashboards, and alert thresholds.
- Support vulnerability management, patching, and remediation processes.
- Assist in audit readiness and compliance evidence collection.
- Strengthen cloud security posture through collaboration with security teams.
Automation & Observability
- Develop automation scripts using Python, PowerShell, or Bash.
- Implement monitoring and alerting solutions (CloudWatch, Datadog, Splunk, etc.).
- Build dashboards for operational health and security reporting.
- Create runbooks and incident response documentation.
- Participate in on-call rotations and post-incident reviews.
- Identify recurring issues and implement long-term corrective solutions.
Required Qualifications
- 10+ years of IT experience with a strong cloud operations background.
- 5+ years of hands-on experience in AWS environments.
- Experience with cloud monitoring, logging, and alerting tools.
- Strong understanding of networking, IAM, and system administration.
- Experience handling production incidents and security alerts.
- Hands-on scripting experience (Python, PowerShell, or Bash).
- Strong analytical and documentation skills.
Preferred Qualifications
- Experience with SIEM or threat detection platforms.
- Hands-on Infrastructure-as-Code experience (Terraform, CloudFormation).
- Knowledge of vulnerability scanning and remediation.
- Familiarity with compliance frameworks (NIST, CIS, ISO 27001).
- AWS certifications (e.g., Solutions Architect, Security Specialty).
- Security+ or equivalent certification.
- Experience hardening AWS environments to enterprise standards.