Site Reliability Engineer

Portland FULL TIME $110,000 - $140,000 / Year
($9,167 - $11,667 / Month)

Job Description

We are seeking a skilled Site Reliability Engineer to ensure our systems are reliable and scalable. As part of our dynamic team, you will implement monitoring solutions, troubleshoot performance issues, and automate operational tasks. The ideal candidate will have a passion for infrastructure and a strong understanding of DevOps practices.

Responsibilities

  • Collaborate with cloud service providers to manage infrastructure resources.
  • Proactively identify potential reliability issues and work on preventive solutions.
  • Create scripts for automation and orchestration tasks.
  • Manage configuration changes to systems across the cloud infrastructure.
  • Document processes and operational improvements to build shared team knowledge.
  • Lead post-mortem meetings to discuss failures and outages.

Requirements

Education
  • Bachelor's degree in Information Technology or related field
  • Master's degree is preferred
Experience
  • 5+ years in DevOps or Site Reliability roles
Technical Skills
  • Containers (Docker, Kubernetes)
  • Monitoring Tools (Prometheus, Grafana)
Soft Skills
  • Problem-Solving
  • Communication
Certifications
  • Google Professional Cloud Architect
  • Certified Jenkins Engineer
Languages
  • English: Fluent

Advantageous

  • Knowledge of Infrastructure as Code (IaC) tools: Experience using tools like Terraform or CloudFormation for resource management.
  • Experience with performance tuning and optimization: Hands-on experience with techniques for improving application and system performance.

Benefits

  • Health, dental, and vision coverage
  • 401(k) with matching contributions
  • Flexible schedules and remote work options
  • Opportunities for professional development and training

Company Culture

  • Inclusivity: We celebrate diversity and are committed to fostering an inclusive workplace for everyone.
  • Transparency: We maintain open communication and transparency in all operations and decisions.
  • Work-Life Balance: We support a healthy work-life balance, understanding that personal time is important.
Status: Open