Anthropic

Staff Software Engineer, Kubernetes Platform

Published 11 May 2026
San Francisco, CA, USA
320K - 405K USD Annual
Full Time

Role Highlights

Languages used

Go
Python
Rust
C++

Key skills

Computer Science
API
Distributed Systems
IaC
AI
Cloud
Research
Machine Learning
Cluster
Operators
Inference
Reliability
Batch
Infrastructure
Physics
Biology
Interpretability
Multimodal

Tools, Libraries and Frameworks

Linux Kernel
Kubernetes
DNS
GPUs
GCP
AWS
GKE
EKS
HTTP

Description

The role involves managing and extending large-scale Kubernetes clusters to support the training and serving of frontier AI models. Responsibilities include scaling the Kubernetes control plane and developing custom scheduling plugins to handle complex, topology-sensitive workloads. The position requires building and maintaining core cluster services to ensure high availability and performance under significant pressure. Additionally, the role entails collaborating with research and infrastructure teams to translate workload requirements into platform capabilities. The work focuses on maintaining system reliability and correctness as the organization's compute footprint expands.

Required Qualifications and Skills

Candidates must possess significant software engineering experience in building and operating production distributed systems. Proficiency in at least one systems-appropriate language such as Go, Python, Rust, or C++ is required, along with deep, hands-on experience with Kubernetes internals. Applicants should demonstrate an ability to debug complex issues across the stack and a track record of designing for system reliability. A bachelor's degree or an equivalent combination of education, training, and experience is the minimum educational requirement.

Disclaimer

Job and company description information and some of the data fields may have been generated via GPT-4 summarisation and could contain inaccuracies. Always rely on the full external job listing link for authoritative information.

About the company

Anthropic

Size

265

Public/Private

Privately Held

Description

Anthropic is an AI safety and research company focused on building AI systems that are reliable, interpretable, and steerable, so that they remain safe and beneficial for users and society. Its interdisciplinary team has experience in machine learning, physics, policy, business, and product development. The company emphasizes collaborative, big-science-style research, working as a single unified team on large-scale research efforts toward its long-term goal of trustworthy AI systems.
