Site Reliability, Mulesoft

Time zone
Full Time
Opening date
Closing date
6 Nov 2021

To get the best candidate experience, please consider applying for a maximum of 3 roles within 12 months to ensure you are not duplicating efforts.

About MuleSoft, a Salesforce Company

MuleSoft makes it easy to connect the world’s applications, data and devices. We provide a flexible, unified software platform that enables organizations to easily build application networks using APIs, the digital glue that allows applications to talk to each other and exchange data. MuleSoft is at the heart of the applications and services you use every day, like Netflix, Spotify and Salesforce, and serves organizations from Global 500 corporations to emerging companies in more than 60 countries.

The Incident Management team at MuleSoft cares about the holistic health of the MuleSoft platform. We define operational success and best practices. We own the incident management process and we communicate and collaborate with our engineers and customers when we have service problems.

We’re looking for collaborative, diligent people with a passion for infrastructure. In this role, as an Incident Commander, you will work side by side with other MuleSoft engineers to solve technical and workflow challenges as we continue to scale our platform. Strong candidates will be comfortable with both the big picture and the minutiae, and bring both technical understanding and sound judgement to the role.

Who are you?

We’re looking for someone who’s interested in complex distributed systems: how they work, how they can work better, and how we even know if they’re working at all. Since the hard problems in computing are human problems, we’re also looking for someone who’s into improving inter-team collaboration, from both a technical and a personal point of view.

This is a good role for a generalist with one or more areas of focus or special interest. There are lots of career paths that might lead you here! You could come from a development or operations background, or technical program management. Perhaps even a technical writing background, or others we haven’t thought of.


  • Calm under pressure
  • Ability to communicate clearly and succinctly in both verbal and written formats
  • Ability to triage the status of a situation and direct work activities
  • Interest in distributed systems and familiarity with how the internet and web applications work. You don’t have to have built a datacenter or run a large cloud service, but you do need to be familiar with the OSI model or equivalents and be able to talk about ways to make a system more resilient to failure.
  • Willing to work as part of a distributed (all-remote) team spanning multiple time zones.

What’s this job like?

On the job, you’ll be a trusted advisor on the reliability of MuleSoft production services. Each member of the team has their own particular strengths, and day-to-day contribution could consist of several of the following:

  • Coordinate incident response as an Incident Commander, ensuring that customer-facing issues are managed promptly and professionally.
  • Assist service teams with incident follow-ups, contributing factor analysis, and incident response analysis, and advise on remediation plans and customer communications.
  • Connect and collaborate with geographically distributed teams and people of various backgrounds.
  • Make it easy to do the right thing with updates to tools, processes and training across the team and organization.
  • Collaborate with multiple service engineering teams to ensure that production services meet uptime goals, scale with customer demand, and are operable and maintainable over time.
  • Facilitate the operation of distributed systems through design and development of diagnostic, monitoring, alert, and mitigation tools.

Key Responsibilities

  • Communicate the state of the incident and the activities under way to restore service to customers
  • Triage suspected incidents
  • Assess customer impact from incidents
  • Collect diagnostics and data
  • Oversee troubleshooting and remediation plans to restore service to customer expectations
  • Collect data to enable problem management

How do I know if I should apply?

If you are interested in or have experience with any of the following topics, you should apply!

  • How humans impact complex system behaviors
  • Design of engineering and collaborative processes
  • Large scale service-oriented infrastructure and the design of scalable, highly available systems
  • Performance characteristics of distributed systems
  • Cloud environments like Amazon Web Services
  • RESTful web services, Linux, Go, Python, Terraform
  • Virtualization and containerization (Xen, LXC, cgroups, Docker, Kubernetes)

For Colorado-based roles: Minimum annual salary of $121,800. You may also be offered a bonus, restricted stock units, and benefits. More details about our company benefits can be found at the following link:



If you require assistance due to a disability when applying for open positions, please submit a request via this Accommodations Request Form.

Posting Statement

At Salesforce, we believe that the business of business is to improve the state of our world. Each of us has a responsibility to drive Equality in our communities and workplaces. We are committed to creating a workforce that reflects society through inclusive programs and initiatives such as equal pay, employee resource groups, inclusive benefits, and more. Learn more about Equality at Salesforce and explore our benefits. Salesforce.com and Salesforce.org are Equal Employment Opportunity and Affirmative Action Employers. Qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender perception or identity, national origin, age, marital status, protected veteran status, or disability status. Salesforce.com and Salesforce.org do not accept unsolicited headhunter and agency resumes. Salesforce.com and Salesforce.org will not pay any third-party agency or company that does not have a signed agreement with Salesforce.com or Salesforce.org.

Salesforce welcomes all.

Pursuant to the San Francisco Fair Chance Ordinance and the Los Angeles Fair Chance Initiative for Hiring, Salesforce will consider for employment qualified applicants with arrest and conviction records.
