Staff Infrastructure Engineer, Cluster Infrastructure

Anthropic

Published 11 May 2026

San Francisco, CA, USA

320K - 405K USD Annual

Full Time

Role Highlights

Languages used

Rust

Python

Key skills

Computer Science

Distributed Systems

Software Systems

Staff Engineer

Infrastructure

Automation

Cluster

Cloud

Security

Research

Inference

Data

Reliability

IaC

IAM

Physics

Biology

Interpretability

Multimodal

Tools, Libraries and Frameworks

Kubernetes

AWS

GCP

Azure

Terraform

Istio

RBAC

HTTP

Description

The role involves owning the technical strategy and roadmap for agent-driven cluster lifecycle management, including provisioning, updates, and decommissioning. The individual will define and drive strategy regarding cluster scalability, homogeneity, and fault tolerance while collaborating with internal research, inference, and product teams. Responsibilities include ensuring new compute capacity is ingested on time and aligning with partner teams on physical build-out and high-bandwidth inter-cluster connectivity. The position requires establishing operational-excellence practices such as incident response and postmortem culture. Additionally, the role involves supporting the growth of engineers through technical mentorship and coaching.

Required Qualifications and Skills

Candidates must possess deep expertise in distributed systems, reliability, and cloud platforms, alongside strong proficiency in at least one systems language such as Rust, Go, or Python. A proven track record of leading complex, multi-quarter technical initiatives and the ability to build alignment across senior stakeholders are required. Preferred candidates have at least eight years of software engineering experience, including time as a technical lead. A bachelor's degree or an equivalent combination of education, training, and experience in a relevant field is required.

Disclaimer

Job and company description information and some of the data fields may have been generated via GPT-4 summarisation and could contain inaccuracies. Always rely on the full external job listing for authoritative information.

About the company

Anthropic

Size

265

Website

anthropic.com

Public/Private

Privately Held

Description

Anthropic is an AI safety and research company focused on crafting AI systems that are reliable, interpretable, and steerable, ensuring they remain safe and beneficial for users and society. With an interdisciplinary team experienced in machine learning, physics, policy, business, and product development, Anthropic is dedicated to the mission of beneficial AI. The company emphasizes collaborative big science research, leveraging a unified team approach to focus on large-scale research efforts aimed at advancing long-term goals of creating trustworthy AI systems.
