Staff Software Engineer – Platform, SysEng | Canada | Remote

Remote from
Canada
Salary, yearly, CAD
186,368 - 223,642
Employment type
Full Time
Job posted
Apply before
10 Jul 2026
Experience level
Senior
Views / Applies
145 / 55

About Grafana Labs

Grafana Labs supports organizations’ monitoring, visualization and observability goals, with 750,000+ active installations.

Actively Hiring
Verified job posting
This job post has been manually reviewed for authenticity and compliance.

AI Summary

Grafana Labs, the company behind the open-source observability platform, is hiring a Staff Software Engineer for its Platform SysEng squad. This remote role involves managing and maturing the internal engineering platform that supports building and deploying services like Grafana, Mimir, and Loki. The position requires experience with distributed systems, Kubernetes, and cloud infrastructure, and includes on-call responsibilities. The team operates in a fast-paced, collaborative, and autonomous environment focused on scaling and reliability.

Role DNA

Job Complexity (scale: Easy to Hard)
Pace & Pressure (scale: Relaxed to Fast-paced)
Autonomy Level (scale: Guided to Full Ownership)
Communication Load (scale: Independent to Highly Collaborative)
AI Insight: The role requires deep expertise in platform engineering, distributed systems, and performance optimization, making it highly challenging for senior engineers.

Salary Analysis

Median Highly Competitive
CAD205,005
CA Market
CAD150k – 250k
AI Insight: The offered salary is competitive, with a midpoint of approximately CAD 205,000, which sits near the middle of the CAD 150k – 250k market range shown for Canada. This reflects the seniority and impact of the role.

Key Skills

Platform Engineering, Distributed Systems, Kubernetes, Cloud Infrastructure, Performance Optimization, Reliability Engineering, Go, Observability, DevOps, Infrastructure as Code

Dear Hiring Team,

I am writing to express my interest in the Staff Software Engineer - Platform, SysEng position at Grafana Labs. With extensive experience in building scalable platform infrastructure, I am excited about the opportunity to contribute to the maturation of your internal engineering platform. My background in distributed systems and performance optimization aligns well with the challenges of reducing region build timelines.

I am particularly drawn to Grafana's open-source legacy and collaborative remote culture. I look forward to the possibility of joining your team and making a meaningful impact.

Sincerely,
[Your Name]

Can you describe your experience with scaling distributed systems and the challenges you faced?
I have worked on scaling a microservices platform from 10 to 1000 nodes, dealing with issues like data consistency, latency, and fault tolerance. I implemented sharding and caching strategies to improve performance and reliability.
How do you approach designing a platform that supports multiple teams with varying needs?
I focus on building modular, extensible components with clear APIs and documentation. I engage with teams to understand their requirements and iterate on the platform based on feedback, ensuring it remains flexible and easy to use.
Describe a time you had to reduce infrastructure costs while maintaining performance.
I led an initiative to right-size our Kubernetes clusters by analyzing resource utilization and implementing autoscaling. This reduced costs by 30% without impacting application performance.
How do you stay updated with new technologies and decide which ones to adopt?
I follow industry blogs, attend conferences, and participate in open-source communities. I evaluate new technologies based on stability, community support, and alignment with our goals, then prototype before full adoption.
Can you explain your experience with incident response and on-call rotations?
I have been on-call for production systems, responding to alerts and performing root cause analysis. I believe in blameless postmortems and using automation to reduce manual toil, such as implementing runbooks and automated remediation.

Grafana Labs, the company behind the open observability cloud, is founded on the principles of open source, open standards, open ecosystems, and open culture. Grafana Cloud, our fully managed observability platform, is flexible and built for scale. With Grafana Cloud’s actually useful AI, organizations can see, understand, and act on all their disparate data to move at the speed of their ambitions. Today, more than 35 million users and 7,000+ customers – including Anthropic, Bloomberg, NVIDIA, Microsoft, and Salesforce – trust Grafana Labs to ensure reliability of their applications and systems, resolve incidents quickly, and optimize their telemetry to reduce noise and cost. We are a 100% remote company with 1,600+ team members across 40+ countries, and we’re backed by leading investors including Lightspeed Venture Partners, Sequoia Capital, GIC, Coatue, J.P. Morgan, CapitalG, and Lead Edge Capital. Learn more at grafana.com and follow us on LinkedIn and X.

We’re scaling fast and staying true to what makes us different: an open-source legacy, a global collaborative culture, and a passion for meaningful work. Our team thrives in an innovation-driven environment where transparency, autonomy, and trust fuel everything we do.

You may not meet every requirement, and that’s okay. If this role excites you, we’d love you to raise your hand for what could be a truly career-defining opportunity.

This is a remote opportunity and we would be interested in applicants located in Canadian time zones (EST + CST highly preferred at this time). 

Staff Backend Engineer – Platform SysEng

The Opportunity: 

Grafana Cloud moves millions of metrics, log lines, and traces per second from our customers’ environments into a highly available, low-latency stack that processes and stores this data and serves it to dashboards and alerting tools. We aim to grow this to hundreds of millions per second, and it’s critical that as we grow, we improve our performance, increase our reliability, and, of course, do it efficiently and effectively.

The Internal Engineering Platform (IEP) delivered by the Platform department provides application engineers with the tools, systems and Kubernetes clusters they need to build, deploy and run their workloads. For Platform roles at Grafana Labs, we look for engineers who have a passion for performance and reliability and who enjoy taking projects from conception to production. We organize ourselves into squads to allow focus on: cloud infrastructure and capacity management; security; engineering productivity; monitoring and sustainability; and US Federal compliance.

Because we deploy production services, we have on-call rotations to ensure the health of the system. Everyone at Grafana Labs tries to incorporate our product lineup into their day-to-day work, so being on call is an important way to understand our system and how people use our products.

What You’ll Be Doing:

We are hiring for the Platform SysEng squad. This is an accelerated, cross-cutting squad that is focused on the maturity and scalability of the platform. Currently, SysEng is working across engineering with a goal of reducing new region build timelines to meet customer demands. 

We’re part of a Platform Engineering group that manages infrastructure for the teams that are building some of the most cherished tools: Grafana, Mimir, Loki, Tempo, and Pyroscope, to name a few.

What Makes You a Great Fit:

  • You enjoy working with engineers, as well as with the management structures that are there to support you and enable you and your team to do your very best.
  • You are comfortable working in a remote-first company; communication is key. For us, working together means being collaborative, friendly, kind, and respectful. We operate by consensus: you contribute to a discussion, then commit to the team decision.
  • Being such a highly distributed company, we would love someone who is keen on working with distributed systems, too.
  • You are eager to learn and grow. There is a lot of room for growth and development, and the team has quite a lot of knowledge to share with those who want to learn.
  • You approach development holistically. The team owns the full life cycle of our code, from writing design docs to reviewing developer feedback and integration testing. We appreciate engineers who enjoy looking at the big picture, and also notice the details of the brush strokes. The Platform team mainly works with Go, Python, and Shell.
  • You have experience with operating your code. Since a lot of operators and developers use our software, having some grounding in both of these spaces really helps us with building better platforms for our users.

We invest heavily in developer productivity. You can use modern AI coding assistants as part of your daily workflow (your choice of tools, within security guidelines), backed by a company-funded usage budget so you can iterate quickly without unnecessary friction. We encourage pragmatic AI-assisted development: faster prototyping, test generation, refactors, documentation, and incident follow-ups—always paired with strong code review and quality standards. You’ll also have access to frontier models (e.g., GPT-Codex 5/3, Claude Opus 4.6, Gemini 3 Pro).

Requirements:

  • Proven delivery of large distributed systems. Experience shipping and operating complex systems that span multiple teams, with clear evidence of technical leadership and impact.
  • Demonstrable experience in system design. Deep understanding of tradeoffs around latency, consistency, availability, scaling and cost.
  • Hands-on cloud and platform experience. Solid experience with cloud-native architectures (microservices, containers/Kubernetes, IaC) and the operational practices that keep them healthy.
  • Reliability and performance ownership. Comfortable defining SLOs/SLIs, doing capacity planning, tuning performance, and driving reliability work end-to-end.
  • Excellent coding and design skills. You write clear, maintainable, well-tested code and can lead technical designs — we use Go, but Python/C/C++/Rust or similar translate well.
  • Comfort with AI-assisted development. We embrace AI and agentic development so we expect you to be curious and comfortable using AI-powered developer tools and ideally have practical experience folding them into a team’s workflow.
  • Influence without authority. Ability to align cross-functional stakeholders, set priorities and drive outcomes in a remote-first environment.
  • Strong communicator. Clear written and verbal communication that works across engineers and non-technical stakeholders.

Bonus Points For:

  • You’ve worked in or on open source, or other community-based projects previously. At Grafana Labs, “OSS is in our DNA”.
  • Familiarity with Kubernetes scheduling and projects like Karpenter.
  • Terraform and/or Crossplane experience. We have mixed usage – each has its strengths.
  • Experience with Tanka and/or Jsonnet.

Compensation & Rewards:

In Canada, the base compensation range for this role is CAD 186,368 – CAD 223,642. Actual compensation may vary based on level, experience, and skillset as assessed in the interview process. Benefits include equity, bonus (if applicable) and other benefits listed here.

All of our roles include Restricted Stock Units (RSUs), giving every team member ownership in Grafana Labs’ success. We believe in shared outcomes—RSUs help us stay aligned and invested as we scale globally.

*Compensation ranges are country specific. If you are applying for this role from a different location than listed above, your recruiter will discuss your specific market’s defined pay range & benefits at the beginning of the process.

Why You’ll Thrive at Grafana Labs:

  • 100% Remote, Global Culture – As a remote-only company, we bring together talent from around the world, united by a culture of collaboration and shared purpose.
  • Scaling Organization – Tackle meaningful work in a high-growth, ever-evolving environment.
  • Transparent Communication – Expect open decision-making and regular company-wide updates.
  • Innovation-Driven – Autonomy and support to ship great work and try new things.
  • Open Source Roots – Built on community-driven values that shape how we work.
  • Empowered Teams – High trust, low ego culture that values outcomes over optics.
  • Career Growth Pathways – Defined opportunities to grow and develop your career.
  • Approachable Leadership – Transparent execs who are involved, visible, and human.
  • Passionate People – Join a team of smart, supportive folks who care deeply about what they do.
  • In-Person onboarding – We want you to thrive from day 1 with your fellow new ‘Grafanistas’ to learn all about what we do and how we do it. 
  • Balance is Key – We operate a global annual leave policy of 30 days per annum. 3 days of your annual leave entitlement are reserved for Grafana Shutdown Days to allow the team to really disconnect. *We will comply with local legislation where applicable.

Equal Opportunity Employer: We will recruit, train, compensate and promote regardless of race, religion, color, national origin, gender, disability, age, veteran status, and all the other fascinating characteristics that make us different and unique. We believe that equality and diversity build a strong organization and we’re working hard to make sure that’s the foundation of our organization as we grow.

Grafana Labs may utilize AI tools in its recruitment process to assist in matching information provided in CVs to job postings. The recruitment team will continue to review inbound CVs manually to identify alignment with current openings.


For information about how your personal data is used once you’ve applied to a job, check out our privacy policy
 


This job listing has been manually reviewed by the Jobicy Trust & Safety Team for compliance with our posting guidelines, including verification of the company's legitimacy, accuracy of job details, clarity of remote work policy, and absence of misleading or fraudulent content.

