Site Reliability Engineer

 UKG (Ultimate Kronos Group)
 Weston, FL
 3 years ago

UKG is seeking a Site Reliability Engineer (SRE) with a robust and diverse background in software engineering, software design, and systems architecture, with a focus on automation, reliability, and system integration. Site Reliability Engineering is an engineering discipline that combines software and systems engineering to build and run large-scale, massively distributed, fault-tolerant systems. An SRE ensures that UKG’s services, both internally critical and externally visible, have reliability and uptime appropriate to users' needs and a fast rate of improvement, while keeping an ever-watchful eye on capacity and performance. At UKG, our SREs come from both development and operations backgrounds and share a passion for running products at scale in production. They are always seeking to understand how our systems work end to end, without boundaries.


Primary/Essential Duties and Key Responsibilities:



  • Engage in and improve the whole lifecycle of services, from conception through production, including system design, build, and deployment

  • Define and implement standards and best practices for system architecture, deployment, metrics, and operational tasks

  • Support services through activities such as monitoring availability, system health, and incident response

  • Improve system performance, application delivery, and efficiency through automation, process refinement, postmortem reviews, and in-depth configuration analysis

  • Communicate across all areas of the organization


Required Qualifications:



  • Knowledge of resilient systems as well as anti-fragility design patterns

  • Knowledge of distributed systems

  • Knowledge of service-oriented architectures

  • Knowledge of microservice architectures

  • Experience in one or more of the following: Python, Go, Angular, .NET Core (C#), Java, Node.js

  • Experience with Unix/Linux operating system internals and administration (e.g., filesystems, inodes, system calls) and networking (e.g., TCP/IP, routing, network topologies)

  • Experience with containerization and related tooling, such as Docker, Kubernetes, and BOSH

  • Experience with Configuration Management (Puppet/Chef/Ansible)

  • Ability to adapt quickly to changing priorities

  • Ability and willingness to work evenings or nights on occasion (participation in the on-call rotation)


Preferred Qualifications:



  • Experience with algorithms, data structures, complexity analysis, and software design

  • Experience with OpenStack

  • Experience administering Elasticsearch, MySQL, MongoDB, RabbitMQ, or Redis in a production environment is a plus

  • Experience with Amazon Web Services or Google Cloud Platform products

  • Exposure to writing SQL scripts

  • Technical writing

  • Communication

  • Auditing

  • Development background

  • Chaos engineering tools such as Gremlin


Check out how we give our employees the chance to work on whatever project they want for 48 hours! https://youtu.be/2Aw55CP1IO8


Typical Interview Process:



  • If your application is selected, a Talent Acquisition team member will reach out to schedule a phone screen.

  • If selected to move forward, you will complete a HackerRank coding assessment.

  • If you pass, you will move forward either to a technical phone call for additional screening or directly to an onsite interview.

  • Offer stage.
