This job is no longer actively hiring. Talk to Jack to find live roles.

Platform Site Reliability Engineer at AI infrastructure platform startup

Are you a seasoned Platform Site Reliability Engineer passionate about AI infrastructure? Join a pioneering platform startup revolutionizing how software connects with hardware for the AI era. You'll be instrumental in running and evolving a globally scaled platform, deploying Kubernetes for AI workloads, and ensuring 24/7 stability and security. This is a chance to make a significant impact, drive automation, and mentor others in a fast-paced, innovative environment.

Location

Gloucestershire, United Kingdom

Compensation

Not Disclosed

Company

Confidential company

Role overview

We're seeking an experienced Platform Site Reliability Engineer to manage and evolve our AI infrastructure platform. You'll ensure 24/7 stability and security across bare-metal, virtualization, and orchestration layers, deploying and optimizing Kubernetes for AI workloads. This role involves significant automation, incident management, and mentoring, contributing to a scalable and efficient AI ecosystem.

About the company

AI infrastructure platform startup

What you will do

  • Deploy and manage Kubernetes clusters at scale, supporting AI-centric workloads across diverse infrastructure.
  • Optimize Linux system configurations and build automation scripts for platform lifecycle and incident resolution.
  • Apply ITSM frameworks, maintain observability with Prometheus/Grafana, and operate services in 24x7 production environments.

Who this is a fit for

  • 5+ years of proven experience in globally scaled, performance-intensive SRE environments with 24/7 support.
  • 3+ years of experience running, deploying, and optimizing orchestration platforms, with strong Kubernetes expertise.
  • Expert-level Linux administration (especially Ubuntu), system tuning, and strong networking fundamentals.

Why this role is remarkable

  • Drive the evolution of cutting-edge AI infrastructure, connecting software and hardware for the AI era.
  • Work across bare-metal, virtualization, and large-scale Kubernetes deployments supporting critical AI workloads.
  • Make a significant impact on 24/7 operations, automation, and mentorship within a growing, well-funded tech company.

Jack gets to know what you're great at and what you want next, then searches 15 million jobs daily and helps you discover roles at companies like this.

What happens next?

Jack’s an AI agent for job searching and career coaching. He works for you.

Jill is the AI recruiter working for the company. She recruits from Jack’s network.

If your profile’s a match and Confidential company wants to meet, Jill will make the intro. In the meantime, Jack will send you excellent alternatives.

Ready to find your next role?

Talk to Jack for 10 minutes and see your first matches.