About Astronomer
The Apache Airflow Company
Astronomer empowers data teams to bring mission-critical software, analytics, and AI to life and is the company behind Astro, the industry-leading unified DataOps platform powered by Apache Airflow®. Astro accelerates building reliable data products that unlock insights, unleash AI value, and power data-driven applications. Trusted by more than 800 of the world’s leading enterprises, Astronomer lets businesses do more with their data. To learn more, visit www.astronomer.io.
The Astronomer Customer Reliability Engineering (CRE) team is responsible for the success of our customers’ usage of our managed Airflow service.
The CREs are responsible for operating, monitoring, and maintaining the platform to ensure availability, predictability, and reliable operations.
As a senior infrastructure specialist within the team, you will focus on the reliability of the underlying cloud infrastructure and Kubernetes clusters. This entails responding to incidents raised either by a customer or by our monitoring system, then taking further steps to ensure problems are permanently resolved or monitored. As owners of the observability platform, CRE has unlimited potential to improve the reliability of the product and deliver the best possible outcome for our customers.
This role is directly customer-facing and gives exposure to very diverse problems and requirements. CREs get the opportunity to interface with customers from a variety of industries and across different cloud providers, all with different expectations. Your contributions will directly impact customers’ success with the Astronomer products, and you will be able to help make meaningful improvements to the customer experience.
Provide solutions to customers to make them successful using our products.
Troubleshoot customer environments and engage in active triaging with customers.
Participate in the on-call rotation for weekend coverage.
Provide feedback to the product development teams on customer needs and pain points.
Build out our monitoring and alerting systems.
Build and maintain automation to ensure daily operational tasks are handled as efficiently as possible.
Help direct the architecture of the products and contribute where possible.
Own the customer experience, working directly with customers to prioritize and solve issues, meet SLAs, and provide “white glove” guidance on the path to production.
Participate remotely within a fully distributed team.
Enhance and enrich customer documentation.
Work with the latest technology and multi-cloud implementations.
6 years of experience, preferably with large, complex cloud infrastructures operating at scale
4 years of experience with Kubernetes
Experience managing a production distributed system with at least one major cloud provider (AWS, GCP, or Azure)
Strong Linux experience
Knowledge of how to operate and monitor issues for distributed systems
Previous experience handling customer issues (internal or external)
Strong communication skills
DevOps or CI/CD experience
Python scripting
Good troubleshooting skills
Experience as a Site Reliability Engineer
Worked with Kubernetes Custom Resources
Depth of knowledge with Azure
Airflow/Big Data Orchestration experience
IaC experience
At Astronomer, we value diversity. We are an equal opportunity employer: we do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status.
Annual salary information is not provided for this position.