Site Reliability Engineer

Job Category: Site Reliability Engineer
Job Type: Onsite
Job Location: San Jose, California
Compensation: Depends on Experience
W2: W2 contract only. Applications on a C2C basis will not be considered for this role.

Job Description:

We are looking for a talented Site Reliability Engineer (SRE) with a strong background in Google Cloud Platform (GCP) and Red Hat OpenShift administration. The ideal candidate will be responsible for ensuring the reliability, performance, and scalability of our on-premises and cloud-based systems, with a focus on reducing Google Cloud costs.

  • System Reliability: Ensure the reliability and uptime of critical services and infrastructure.
  • Google Cloud Expertise: Design, implement, and manage cloud infrastructure using Google Cloud services.
  • Automation: Develop and maintain automation scripts and tools to improve system efficiency and reduce manual intervention.
  • Monitoring and Incident Response: Implement monitoring solutions and respond to incidents to minimize downtime and ensure quick recovery.
  • Collaboration: Work closely with development and operations teams to improve system reliability and performance.
  • Capacity Planning: Conduct capacity planning and performance tuning to ensure systems can handle future growth.
  • Documentation: Create and maintain comprehensive documentation for system configurations, processes, and procedures.

Qualifications:

  • Education: Bachelor’s degree in Computer Science, Engineering, or a related field.
  • Experience: 10+ years of experience in site reliability engineering or a similar role.

Skills:

  • Proficiency in Google Cloud services (Compute Engine, Kubernetes Engine, Cloud Storage, BigQuery, Pub/Sub, etc.).
  • Familiarity with Google BI and AI/ML tools (Looker, BigQuery ML, Vertex AI, etc.).
  • Experience with automation tools (Terraform, Ansible, Puppet).
  • Familiarity with CI/CD pipelines and tools (Azure Pipelines, Jenkins, GitLab CI, etc.).
  • Strong scripting skills (Python, Bash, etc.).
  • Knowledge of networking concepts and protocols.
  • Experience with monitoring tools (Prometheus, Grafana, etc.).

Preferred Certifications:

  • Google Cloud Professional DevOps Engineer
  • Google Cloud Professional Cloud Architect
  • Red Hat Certified Engineer (RHCE) or similar Linux certification