Senior Infrastructure Engineer
Groq
Date: 1 week ago
City: Remote
Contract type: Full time
Mission: At Groq, we are building a custom cloud from the ground up, one data center at a time. Our Compute Storage team owns the systems that turn racks of bare metal into production-ready Kubernetes clusters powering the next generation of AI workloads.
We are looking for a Staff Infrastructure Engineer to help us scale this effort. This is a hands-on role focused on fully automating deployment and lifecycle management of the Groq Cloud server fleet. You will work closely with data center, network, and platform teams to define and develop tools and automation that enable seamless deployment and management of Groq compute nodes and storage clusters. We're looking for someone passionate about infrastructure who enjoys debugging close to the metal. If you're eager to grow your skills in deploying, scaling, and optimizing bare metal to support complex distributed HPC in the expanding inference market, we would love to talk.
Responsibilities & Opportunities In This Role
- Develop robust, scalable automation solutions (Go, Python, Bash) to streamline and standardize deployment workflows across global data center environments.
- Collaborate cross-functionally with data center operations, networking, and platform teams to ensure infrastructure is fully integrated and production-ready.
- Develop automation that brings all production machines and clusters up to health standards promptly and keeps them there.
- Define best practices and standards for infrastructure-as-code and configuration management using Git, Flux, Terraform, and related tools.
- Set technical direction and maintain high-quality system documentation, operational runbooks, and internal tooling that improve the resilience, repeatability, and observability of the infrastructure stack.
Qualifications
- Experience with deploying and supporting Linux / Kubernetes systems at scale.
- Familiarity with infrastructure-as-code and Git-based workflows (e.g., Terraform, Flux, Kustomize).
- Ability to write and maintain basic tooling in common modern languages such as Go and Python.
- Understanding of networking fundamentals (IPAM, VLANs, DHCP, DNS).
- Working knowledge of storage concepts (block vs object, NFS, RAID, etc.).
- Strong sense of ownership and a willingness to work through ambiguity.
- Experience provisioning physical machines in a data center environment.
- Exposure to Talos Linux, Kubernetes bootstrapping, or Kubernetes platform engineering.
- Previous collaboration with facilities, hardware, or network teams in an operational role.
- Humility - Egos are checked at the door
- Collaborative & Team Savvy - We make up the smartest person in the room, together
- Growth & Giver Mindset - Learn it all versus know it all, we share knowledge generously
- Curious & Innovative - Take a creative approach to projects, problems, and design
- Passion, Grit, & Boldness - No-limit thinking, fueling informed risk-taking
Compensation: At Groq, a competitive base salary is part of our comprehensive compensation package, which includes equity and benefits. For this role, salary range is determined by your location, skills, qualifications, experience and internal benchmarks. Compensation for candidates outside the USA will be dependent on the local market.