Site Reliability Engineer

San Diego | Full-time | $110,000 - $145,000 / Year
($9,166 - $12,083 / Month)

Job Description

We are seeking a Site Reliability Engineer to enhance our cloud infrastructure and ensure high availability of our applications. The successful candidate will work closely with development teams to implement and maintain monitoring, automation, and incident response strategies.

Responsibilities

  • Lead initiatives to improve overall system availability and performance.
  • Evaluate and integrate new tools and technologies to enhance infrastructure.
  • Conduct post-mortem analysis on outages and incidents.
  • Guide and mentor junior team members in SRE practices.
  • Foster a culture of reliability and operational excellence across teams.

Requirements

Education
  • Bachelor's degree in Computer Science or related field
  • Master's degree in a relevant field is preferred
Experience
  • 3+ years of experience in cloud-based environments
Technical Skills
  • Docker
  • Prometheus
Soft Skills
  • Team Leadership
  • Adaptability
Certifications
  • AWS Certified Solutions Architect
  • Microsoft Azure Administrator
Languages
  • English: Fluent

Advantageous

  • Familiarity with Python: Ability to use Python for scripting and automation tasks.
  • Experience with monitoring tools: Hands-on experience with tools such as Grafana and Datadog.

Benefits

  • Full health coverage for employees
  • 401(k) plan with company matching contributions
  • Flexible scheduling and remote work options
  • Investment in team training and growth

Company Culture

  • Innovation Orientation: We encourage innovative thinking and experimentation.
  • Growth Mindset: We support personal and professional growth for all team members.
  • Respectful Workplace: We are committed to creating a respectful and supportive environment.

Status: Closed