The Production Support Engineering team plays a key role at Amount by ensuring production issues are managed efficiently and effectively. You will manage high-priority issues to resolution following industry best practices. You'll troubleshoot, fix, and apply workarounds to resolve technical issues across multiple platforms. Each day, you'll interact with every aspect of our organization to find the best solution for our partner. Management of ticket queues, monitoring for issues and post-release validation are also a large part of this role, all while meeting our partner's SLA requirements.
Team: This role interacts with nearly every group within the organization, including engineering, product, QA, customer success and others.
Salary: $63,000-$73,000 base salary
Bonus & Equity: Amount employees are eligible for annual performance bonuses and equity grants as part of our commitment to shared success!
Similar job titles: Production Support, Production Support Analyst, Incident Manager, Incident Coordinator, IT Major Incident Manager, Application Support Engineer, Support Engineer
WHAT WE'LL TRUST YOU TO DELIVER:
- Technical ability to deep dive into issues by querying tables, analyzing data and problem-solving
- Prioritization and triage of incoming requests/issues
- Drive incident resolution and lead conversations with cross-functional groups. Ask the right questions to help determine impact/priority and the correct route for resolution. Oversee a technical bridge, if required.
- Management of all incidents through the incident management lifecycle
- Documentation of all relevant events, getting status reports while driving decision-making and resolution
- Ensure stakeholders are updated according to predefined service level agreements
- Completion and ownership of the postmortem with appropriate root cause analysis performed
- Improvement suggestions to capture preventative measures that will avoid recurrences of incidents
- Investigate patterns that indicate larger overall issues, even if we don't have the solution.
- Compilation of metrics on a weekly and monthly basis. Maintain dashboards for service incidents and ad hoc reporting as requested
- Play an active role during critical incidents, which may occur outside of normal business hours. Availability on nights, weekends, and holidays as part of an on-call rotation is a must
- Creation of runbooks or standard operating procedures (SOP) so we can all learn from each other and add to our knowledge base
WHAT YOU LIKELY BRING TO THE TABLE:
- Technical and/or engineering background, ideally with experience writing SQL queries
- Experience working with development teams in a fast-paced environment
- Basic knowledge of, or interest in, a programming language such as Java, Python or Ruby
- 2 years of experience coordinating and executing major incidents, with demonstrated capacity to lead under pressure
- Previously collaborated with a wide spectrum of internal and external stakeholders
- Worked in an organization with a complex business environment
- Leadership skills with the ability to make quick decisions
- Familiar with ITSM/ITIL concepts
- You thrive as a self-starter who can lead others during stressful situations
- Familiar with tools such as Confluence and Jira, on-call management software such as PagerDuty, and error monitoring software such as Sentry and Kibana
ABOUT AMOUNT (TL;DR)
Founded: 2020
Employees: 150+
Locations: Chicago (HQ) and US Remote
Funding: Amount has raised $281M in total equity capital since inception, including most recently at a valuation of $1B. Investors include WestCap, Hanaco Ventures, Goldman Sachs, Invus Opportunities, Mastercard, and PSCU