Site Reliability Engineer

About the Role

You will co-own production services and ensure they are reliable and scalable. You will support delivery of new features and maintain day-to-day operations. You will collaborate with software engineering to drive operational improvements using metrics, automate repetitive tasks, maintain CI/CD pipelines, implement monitoring, and participate in on-call rotations to investigate and resolve incidents. You will manage and scale infrastructure and build tooling to improve operational efficiency.

Requirements

  • Extensive experience deploying, managing and troubleshooting infrastructure in AWS
  • Experience managing the full lifecycle of container deployments to production using self-managed Kubernetes, ECS, or EKS
  • Experience building custom tooling when needed
  • Understanding of CI/CD and experience building deployment tooling
  • Ability to solve problems in distributed Linux systems and trace requests across applications, systems and networks
  • Certified Kubernetes Security Specialist (CKS) certification
  • Ability to automate routine tasks and proficiency in at least two programming languages
  • Excellent spoken and written communication skills
  • Ability to collaborate and also solve problems independently

Responsibilities

  • Co-own production services and ensure reliability and scalability
  • Deliver new features and manage day-to-day operations of services
  • Collaborate with software engineering to identify and drive operational improvements using metrics
  • Develop and maintain application performance benchmarks
  • Drive operational efficiencies in release processes and monitoring
  • Build and maintain tooling and automation for operations
  • Maintain and improve CI/CD pipelines
  • Participate in a weekly on-call rotation to investigate and resolve system issues
  • Manage and scale infrastructure systems

Benefits

  • Healthcare Insurance (Zero Hash covers roughly 100% of employee premiums, U.S. Only)
  • Chance to earn equity
  • Vision Insurance (U.S. Only)
  • Dental Insurance (U.S. Only)
  • Maternity & Paternity leave
  • Visa sponsorship
  • 401(k) (U.S. Only)

Site Reliability Engineer at Zero Hash LLC | JobStash