Anthropic

Staff Infrastructure Engineer, Cluster Infrastructure

Published 11 May 2026
London, UK
325K - 485K GBP Annual
Full Time

Role Highlights

Languages used

Go
Rust
Python

Key skills

Computer Science
Distributed Systems
Software Systems
Staff Engineer
AI
Infrastructure
Automation
Cluster
Cloud
Security
Research
Inference
Data
Reliability
IaC
IAM
Physics
Biology
Interpretability
Multimodal

Tools, Libraries and Frameworks

Kubernetes
AWS
GCP
Azure
Terraform
Istio
RBAC
HTTP

Description

The role involves owning the technical strategy and roadmap for agent-driven cluster lifecycle management, including provisioning, updates, and decommissioning. The individual will define and drive strategies regarding cluster scalability, homogeneity, and fault tolerance while collaborating with internal research, inference, and product teams. Responsibilities include partnering with cross-functional teams to ensure compute capacity is ingested on time and aligning on physical build-outs. The position requires establishing operational excellence practices such as incident response and postmortem culture. Additionally, the role involves supporting the growth of engineers through technical mentorship and coaching.

Required Qualifications and Skills

Candidates must have deep expertise in distributed systems, reliability, and cloud platforms, along with strong proficiency in at least one systems language such as Rust, Go, or Python. A proven track record of leading complex, multi-quarter technical initiatives and the ability to build alignment across senior stakeholders are required. The role requires at least a bachelor's degree or an equivalent combination of education, training, and experience. Preferred candidates have eight or more years of software engineering experience, including experience operating compute infrastructure at hyperscale.

Disclaimer

Job and company description information and some of the data fields may have been generated via GPT-4 summarisation and could contain inaccuracies. The full external job listing link should always be relied on for authoritative information.

About the company

Anthropic

Size

265

Public/Private

Privately Held

Description

Anthropic is an AI safety and research company focused on crafting AI systems that are reliable, interpretable, and steerable, ensuring they remain safe and beneficial for users and society. With an interdisciplinary team experienced in machine learning, physics, policy, business, and product development, Anthropic is dedicated to the mission of beneficial AI. The company emphasizes collaborative big science research, leveraging a unified team approach to focus on large-scale research efforts aimed at advancing long-term goals of creating trustworthy AI systems.
