Senior Infrastructure Engineer, Core Systems
About the Role
You will architect, develop, and implement next-generation infrastructure platforms across private and public clouds and hybrid environments. You will build automation using Infrastructure-as-Code and configuration management, create and maintain infrastructure standards and documentation, improve observability, optimize resource allocation and costs, and run capacity planning and disaster recovery initiatives to ensure reliability and scalability.
Requirements
- Minimum 5 years of experience in Systems Administration, Datacenter Operations, or Infrastructure Engineering with deep expertise in Linux/Unix systems.
- Proven success designing and managing hybrid cloud and traditional infrastructure platforms at scale, including bare-metal, virtualized, and cloud-native environments.
- Hands-on experience with automation, configuration management, and CI/CD tools such as Terraform, Ansible, Consul, and Jenkins.
- Proficiency with observability and monitoring platforms such as Grafana, ELK, VictoriaMetrics, and Datadog.
- Programming experience in one or more languages such as Python, Go, or JavaScript.
- Strong understanding of infrastructure architecture principles including scalability, redundancy, fault tolerance, and cost optimization.
- Familiarity with containerization (Docker, Kubernetes) and networking fundamentals across cloud and datacenter environments.
- Proactive, analytical mindset, the ability to deliver under pressure, and strong communication and documentation skills.
Responsibilities
- Research, architect, and deploy complex infrastructure systems across bare-metal servers, hypervisors, orchestrators, virtual machines, and containerized environments.
- Design and implement automation using Infrastructure-as-Code and configuration management principles (Terraform, Ansible, Consul) to ensure reproducibility, speed, and consistency in infrastructure deployment.
- Establish and maintain infrastructure standards, documentation, and architecture diagrams to support scale, reliability, and compliance across environments.
- Partner with Technical Operations, CloudOps, and Platform teams to ensure smooth integration of systems, improve deployment efficiency, and enhance observability.
- Optimize resource allocation and infrastructure cost efficiency while maintaining performance and uptime goals.
- Continuously improve resilience and scalability through proactive capacity planning, disaster recovery testing, and infrastructure modernization initiatives.
Benefits
- Eligible for a quarterly bonus tied to company and individual goal achievement
- Competitive benefits package in locations where the company operates
Skills
Capacity Planning, Virtualization, Configuration Management, Fault Tolerance, Consul, VictoriaMetrics, Hypervisor, Cost Optimization, Infrastructure Architecture, Ansible, Go, Scalability, Jenkins, Linux, Bare-Metal, Automation, Containerization, JavaScript, Terraform, ELK, Monitoring, Infrastructure-as-Code, Networking, Observability, CI/CD, Grafana, Datadog, Redundancy
