Lead Site Reliability Engineer

 Evidation Health
 3 years ago

**Considering candidates in San Mateo, Santa Barbara, or US-based remote**


We are looking for a Lead Site Reliability Engineer who is passionate about highly available systems and the processes it takes to get there. You will work with multiple teams at Evidation and be responsible for how code is deployed, configured, and monitored, as well as the availability, latency, change management, emergency response, and capacity management of services in production. Experience with cloud architecture, cloud security, continuous integration, continuous delivery, and infrastructure as code, along with a strong operational background, is a must. More than a set of skills, we are looking for someone who is curious, collaborative, great at communicating, and always willing to learn. If you are ready to help build a great SRE organization, this is the perfect opportunity.


RESPONSIBILITIES



  • Partner with our engineering teams to properly manage and respond to production issues.

  • Ensure that proper logging, monitoring, and alerting are set up.

  • Work with teams during incidents to fix issues in a timely manner, understand the root cause, and drive action items so the issues don't happen again.

  • Work with each team on their disaster recovery plans, including leading tabletop exercises.

  • Partner with DevOps, Test Automation, IT, Engineers, Project Managers, Quality, and leadership to understand where the opportunities for improvement are.


QUALIFICATIONS


Minimum Qualifications:



  • Experience in site reliability or DevOps

  • Experience in leading and building out an SRE function

  • Hands-on experience managing / supporting Linux production environments

  • Experience with AWS

  • Experience with Incident Management including

  • Experience with Kubernetes

  • Experience with CI/CD tools

  • Strong written and verbal communication, including the ability to quickly synthesize and analyze inputs from a variety of sources


Preferred Qualifications:



  • Experience with AWS EKS, ECS, Fargate

  • Experience with the Atlassian stack (Jira, Confluence, Opsgenie)

  • Experience with Datadog

