Technical Program Manager, Data Centers

Remote from: Europe, Netherlands
Annual salary: Undisclosed
Employment type: Full Time
Apply before: 29 Jul 2026
Experience level: Senior

About Nebius

Nebius is the AI cloud company, delivering a unified platform that spans the complete AI journey from data and model training and tuning to production runtime and deployment.


AI Summary

Nebius is seeking a Technical Program Manager to oversee the operational health of its fleet of data centers, including colocation (COLO) and build-to-suit (BTS) sites. This individual contributor role involves managing SLAs, coordinating maintenance, conducting audits, and serving as the primary interface with landlords and operators. The ideal candidate has 10+ years of experience in data center operations or technical program management and is skilled in vendor management and incident coordination. The position is listed as remote from Europe (Netherlands) and requires deep expertise in power, cooling, and connectivity infrastructure. The team operates at a fast pace, ensuring reliable infrastructure for AI cloud workloads.

Role DNA

[Rating scales; the slider positions are not recoverable from the page]
Job Complexity: Easy to Hard
Pace & Pressure: Relaxed to Fast-paced
Autonomy Level: Guided to Full Ownership
Communication Load: Independent to Highly Collaborative

AI Insight: Requires 10+ years of experience and management of complex multi-site data center operations, vendor relationships, and SLA enforcement, making it challenging but not at the highest difficulty.

Salary Analysis

Estimated median (US market): $165,000
Estimated range (US market): $130k – $200k

AI Insight: The salary was not specified in the listing; based on US market data for a senior Technical Program Manager in data centers, the estimated median salary of $165,000 is competitive. This aligns with the required 10+ years of experience and critical nature of the role.

Dear Hiring Manager,

I am writing to express my interest in the Technical Program Manager, Data Centers position at Nebius. With over a decade of experience managing critical infrastructure in colocation and build-to-suit environments, I am confident in my ability to ensure the operational health of your data center fleet. My expertise in SLA enforcement, vendor management, and incident coordination aligns perfectly with the responsibilities outlined in the job description.

In my previous role, I successfully drove site audits, coordinated maintenance windows, and held providers accountable to contractual commitments, reducing downtime by 20%. I thrive in fast-paced environments and am skilled at translating operational data into actionable insights for leadership. I am eager to bring my technical acumen and systematic approach to Nebius’s AI cloud infrastructure team.

Thank you for considering my application. I look forward to the opportunity to discuss how my experience can support Nebius’s growth.

Sincerely,
[Your Name]

Describe a time you managed a critical SLA breach with a colocation provider. What steps did you take to resolve it and prevent recurrence?
In a previous role, we experienced a power outage that exceeded the agreed SLA. I immediately escalated to the provider, documented the incident, and led a root cause analysis. I worked with their team to implement redundant power feeds and monthly testing. I also renegotiated the SLA to include stricter penalties and regular reporting.

How do you prioritize maintenance windows across multiple sites to minimize impact on live workloads?
I assess risk by reviewing workload criticality, redundancy levels, and historical incident data. I collaborate with IT teams to schedule maintenance during low-usage periods and use staggered approaches. I ensure all MOPs are reviewed and approved, and maintain communication with stakeholders about potential impacts.

Can you give an example of how you held a vendor accountable for contractual obligations?
At my last company, a landlord failed to meet cooling capacity targets. I formalized a performance improvement plan, tracked metrics weekly, and escalated to their management. When improvements didn't materialize, I initiated a contractual penalty process and eventually transitioned to a new provider.

What metrics do you use to monitor data center health, and how do you present them to leadership?
I track power usage effectiveness (PUE), temperature/humidity, uptime, SLA compliance, and maintenance completion rates. I create dashboards in tools like Tableau or Power BI, with red/yellow/green indicators. I provide weekly summaries to leadership, highlighting trends and risks.

How would you handle an incident where a critical server rack loses power due to a circuit breaker trip?
First, I would verify backup systems (UPS/generator) are operational and coordinate with the site team to restore power safely. I'd initiate an incident response, communicate with affected internal teams, and begin a root cause analysis. After resolution, I'd work with the provider to implement preventive measures like load balancing and breaker monitoring.
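
For candidates preparing the metrics answer above, the PUE figure and the red/yellow/green indicator can be sketched in a few lines of Python. PUE is total facility energy divided by IT equipment energy; everything else here is an assumption made for illustration: the `SiteReading` fields, the site labels, the sample values, and the colour thresholds are hypothetical and do not come from Nebius or the listing.

```python
from dataclasses import dataclass


@dataclass
class SiteReading:
    site: str            # hypothetical site label
    facility_kwh: float  # total facility energy over the period
    it_kwh: float        # IT equipment energy over the same period


def pue(reading: SiteReading) -> float:
    """Power usage effectiveness: total facility energy / IT equipment energy."""
    if reading.it_kwh <= 0:
        raise ValueError("IT energy must be positive")
    return reading.facility_kwh / reading.it_kwh


def rag_status(value: float, green_max: float = 1.3, yellow_max: float = 1.5) -> str:
    """Map a PUE value to a dashboard colour; thresholds are illustrative only."""
    if value <= green_max:
        return "green"
    if value <= yellow_max:
        return "yellow"
    return "red"


# Made-up sample readings, not real site data.
readings = [
    SiteReading("site-a", facility_kwh=1200.0, it_kwh=1000.0),
    SiteReading("site-b", facility_kwh=1600.0, it_kwh=1000.0),
]
for r in readings:
    value = pue(r)
    print(f"{r.site}: PUE {value:.2f} -> {rag_status(value)}")
```

In practice the thresholds would be agreed per site, since achievable PUE depends on climate, cooling design, and load.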

About Nebius:

Nebius is leading a new era in cloud infrastructure for the global AI economy. We are building a full-stack AI cloud platform that supports developers and enterprises from data and model training through to production deployment, without the cost and complexity of building large in-house AI/ML infrastructure.

Built by engineers, for engineers. From large-scale GPU orchestration to inference optimization, we own the hard problems across compute, storage, networking and applied AI.

Listed on Nasdaq (NBIS) and headquartered in Amsterdam, we have a global footprint with R&D hubs across Europe, the UK, North America and Israel. Our team of 1,500+ includes hundreds of engineers with deep expertise across hardware, software and AI R&D.

The Role

We are looking for a Technical Program Manager to own the operational readiness and ongoing health of our fleet of Data Centers, both COLO and BTS sites. In this role you will be the single point of accountability for ensuring each site runs as expected — SLAs met, maintenance executed on schedule, and audits passed — across a growing portfolio of landlord-operated and purpose-built facilities. You will operate as the primary interface between Nebius and our data center landlords and operators, and you will partner closely with the Nebius IT team to translate site-level operations into reliable infrastructure for our customers.

This is an individual contributor position for someone who is equally comfortable in a contractual SLA review, a maintenance window planning call, and a physical site audit. You will define the mechanisms that keep our sites accountable and surface risk before it becomes an incident.

Key job responsibilities

  • Own the operational health of Nebius COLO and BTS sites, ensuring each facility runs to expectation across power, cooling, space, connectivity, security, and environmental controls.
  • Track, monitor, and enforce SLA compliance across landlords and colocation providers; identify breaches, drive remediation, and hold providers accountable to contractual commitments.
  • Manage and coordinate site maintenance schedules — preventive and corrective — including planning and approving maintenance windows, reviewing Methods of Procedure (MOPs), and minimizing risk to live workloads.
  • Plan and drive site audits covering compliance, capacity, power/cooling performance, physical security, and safety; track findings to closure.
  • Serve as the primary day-to-day interface with data center landlords and operators, managing the operational relationship, escalations, and coordination of on-site activity.
  • Partner closely with the Nebius IT team on deployments, capacity planning, incident response, and change management at each site.
  • Build reporting mechanisms and dashboards that give leadership clear visibility into site health, SLA performance, maintenance status, and open risk across the portfolio.
  • Lead incident coordination and post-incident follow-up, including root cause analysis and corrective action tracking with landlords and internal teams.
  • Track and manage contractual operational obligations, deliverables, and timelines across multiple sites and providers simultaneously.
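
As a rough illustration of the SLA tracking described in the responsibilities above, the sketch below converts recorded downtime into an availability percentage and checks it against a target. It is a minimal example under stated assumptions: the function names, the 30-day period, and the 99.99% target are hypothetical, and real contracts define measurement windows, exclusions, and credits in their own terms.

```python
def availability_pct(downtime_minutes: float, period_minutes: float) -> float:
    """Availability over a period, as a percentage."""
    if period_minutes <= 0:
        raise ValueError("period must be positive")
    if not 0 <= downtime_minutes <= period_minutes:
        raise ValueError("downtime must be between 0 and the period length")
    return 100.0 * (1 - downtime_minutes / period_minutes)


def allowed_downtime_minutes(target_pct: float, period_minutes: float) -> float:
    """Downtime budget implied by an availability target."""
    return period_minutes * (1 - target_pct / 100.0)


def is_breach(downtime_minutes: float, period_minutes: float, target_pct: float) -> bool:
    """True when measured availability falls below the target."""
    return availability_pct(downtime_minutes, period_minutes) < target_pct


# Hypothetical figures: a 30-day month and a 99.99% availability target.
PERIOD = 30 * 24 * 60  # minutes in the measurement period
TARGET = 99.99

budget = allowed_downtime_minutes(TARGET, PERIOD)
print(f"Downtime budget: {budget:.2f} min")
print("Breach after 10 min outage:", is_breach(10, PERIOD, TARGET))
```

A check like this only flags a candidate breach; whether it counts contractually still depends on the exclusions and reporting rules in each provider agreement.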

About the team

The Data Center team is responsible for the physical infrastructure that underpins Nebius’ AI cloud. We manage the full lifecycle of our COLO and BTS footprint — from bringing new capacity online to keeping live sites running reliably at scale. We work at the intersection of facilities operations, vendor management, and IT infrastructure, and we move fast because our customers’ AI workloads depend on the reliability we deliver.

Basic qualifications

  • 10+ years of experience in technical program management, data center operations, or critical facilities/infrastructure management.
  • Experience managing data center infrastructure and operations (power, cooling, space, connectivity) in colocation, build-to-suit, or owned environments.
  • Experience managing third-party vendors, landlords, or service providers against SLAs and contractual obligations.
  • Demonstrated ability to manage multiple programs, sites, or workstreams simultaneously and drive them to measurable outcomes.
  • Bachelor’s degree in a relevant field, or equivalent practical experience.

Preferred qualifications

  • Direct experience with colocation (COLO) and build-to-suit (BTS) data center models, including operating across multiple landlords and operators.
  • Working knowledge of data center SLAs, MOPs/SOPs, maintenance regimes, and audit and compliance frameworks (e.g., Uptime Institute Tier standards, SOC 2, ISO 27001).
  • Experience supporting AI/HPC, GPU cluster, or other high-density compute infrastructure.
  • Strong familiarity with incident management and root cause analysis in a critical facilities context.
  • Experience building reporting mechanisms, dashboards, or operational scorecards for infrastructure health and risk.
  • PMP, Uptime ATD, or equivalent program/operations certification.
  • Proficiency with program and ticketing tools (e.g., Jira, ServiceNow) and comfort working with operational data.
  • Willingness to travel to sites as needed.

Benefits & Perks:

  • Competitive compensation
  • Career growth and learning opportunities
  • Flexibility and ownership
  • Collaborative and innovative culture
  • Opportunity to work on impactful AI projects
  • International environment and talented teams

What’s it like to work at Nebius:

Fast moving – Bold thinking – Constant growth – Meaningful impact – Trust and real ownership – Opportunity to shape the future of AI 

Equal Opportunity Statement:

Nebius is an equal opportunity employer. We are committed to fostering an inclusive and diverse workplace and to providing equal employment opportunities in all aspects of employment. We do not discriminate on the basis of race, color, religion, sex (including pregnancy), national origin, ancestry, age, disability, genetic information, marital status, veteran status, sexual orientation, gender identity or expression, or any other characteristic protected by applicable law.

Applicants must be authorized to work in the country in which they apply and will be required to provide proof of employment eligibility as a condition of hire. 

If you need accommodations during the application process, please let us know.


This job listing has been manually reviewed by the Jobicy Trust & Safety Team for compliance with our posting guidelines, including verification of the company's legitimacy, accuracy of job details, clarity of remote work policy, and absence of misleading or fraudulent content.
