Infrastructure Engineer
About the Role
You will operate and improve production infrastructure, manage incidents, participate in the on-call rotation, and design and execute scalable solutions across multi-cloud Kubernetes environments. You will also automate repetitive processes, improve monitoring and alerting, document systems thoroughly, and drive reliability and performance improvements.
Requirements
- Experience operating mission-critical services and owning their reliability, uptime, and SLAs
- Certified Kubernetes Administrator (CKA)
- Americas time zone availability
- Production experience with containers and orchestration platforms such as Kubernetes or Nomad
- Experience with infrastructure automation tools such as Helm, Terraform, Terragrunt, or Ansible
- Experience with monitoring solutions such as Grafana, Prometheus, VictoriaMetrics, or BetterUptime
- Public cloud experience with providers such as AWS, GCP, or Azure
- Programming skills in Python and/or Go
- RDBMS experience
- Proficiency in Linux and the shell
- Strong systems thinking including edge cases and failure modes
- Enthusiastic attitude and a habit of documenting work
- Strong interest in blockchain technologies
Responsibilities
- Maintain and improve reliability and scalability of services
- Operate and manage multi-cloud Kubernetes clusters using Infrastructure as Code
- Manage incidents and participate in the on-call rotation
- Improve monitoring and alerting
- Automate repetitive processes to reduce workload
- Plan, design, and execute infrastructure solutions
- Identify scalability bottlenecks and drive long-term resolution
- Improve and maintain documentation
- Share domain knowledge with team members
Benefits
- Stock options
- Flexible schedule
- Paid weekend on-call per month
