Stripe

Incident Response Manager

Posted 15 Days Ago
Remote
Mid level
The Incident Response Manager drives responses to incidents, acts as an Incident Commander, and collaborates with teams for incident resolution and communication.
The summary above was generated by AI

Who we are

About Stripe

Stripe is a financial infrastructure platform for businesses. Millions of companies—from the world’s largest enterprises to the most ambitious startups—use Stripe to accept payments, grow their revenue, and accelerate new business opportunities. Our mission is to increase the GDP of the internet, and we have a staggering amount of work ahead. That means you have an unprecedented opportunity to put the global economy within everyone’s reach while doing the most important work of your career.

About the team

The Incident Ops team is a global 24/7 team responsible for driving incident response and management from detection to resolution. Stripe is proud of its five 9s reliability, and this team is at the forefront of ensuring we keep it that way, working hand-in-hand with Reliability Eng and across the Tech Org. This team of Incident Response Managers (IRMs) is defined by its sense of ownership and by how it drives incidents to resolution: marshaling the necessary cross-functional resources to respond to and resolve service outages, critical bugs, security attacks, and anything else that significantly impacts the users of our products. The team is user-first and ensures appropriate external communications from Stripe and senior management to keep our users informed of disruption to their experience of Stripe. The team is skilled in communications and incident handling and is technically adept, as incidents can arise from anywhere and cut across products and orgs in Stripe.

What you’ll do

As an Incident Response Manager (IRM), you’ll play a crucial role in driving the right level of response from Stripes to incidents: determining impact, rallying Stripes to mitigate, communicating to users, ensuring appropriate remediations, and orchestrating the Root Cause Analysis (RCA) process. You’ll work closely with IRMs and responding teams globally to ensure solid 24/7 coverage of how we monitor, detect, respond to, communicate about, and mitigate incidents. You’ll focus on developing your skills in incident management, communication, and technical understanding of Stripe’s products and services. When not managing incidents, you’ll contribute to improving our operations.

Responsibilities

  • Act as an Incident Commander for incidents across various classes (reliability, technical, data privacy, product, or security), driving incident resolution with urgency and cross-functional collaboration
  • Lead all user-facing incidents across domains at Stripe - including reliability, technical, security, and data privacy
  • Apply a "User First" approach to determining impact, providing accurate situation reports, facilitating comms bridges, and ensuring useful and timely external communications to users
  • Update internal stakeholders and support decision-making processes during incidents
  • Participate in the root cause analysis process, conduct post-mortems for routine incidents, and identify remediations
  • Collaborate with engineering, product, and operations teams to improve incident handling processes and tooling
  • Contribute to team culture and processes that enhance incident response capabilities

Who you are

We’re looking for someone who meets the minimum requirements to be considered for the role. If you meet these requirements, you are encouraged to apply. The preferred qualifications are a bonus, not a requirement.

Minimum requirements

  • 3+ years of demonstrable major incident experience for organizations that run mission-critical applications or always-on SaaS environments.
  • Demonstrated ability to independently lead multiple incidents concurrently with minimal support and guidance from senior team members
  • Basic understanding of application development, architectures, and cloud environments
  • Familiarity with infrastructure concepts, including physical, virtual, and container-based compute platforms
  • Practical experience using modern monitoring and telemetry tools such as Splunk, Prometheus, and Grafana
  • Basic data analysis skills using SQL, Splunk, or other tools.
  • Strong task management skills, with attention to detail and ability to remain composed in high-pressure situations.
  • Good written and verbal English communication skills, with the ability to translate complex technical issues for various stakeholders.

Preferred qualifications

  • Familiarity with different types of incidents, such as technical, privacy, security, or crisis, with an eagerness to continually learn about Stripe's products and systems.
  • Experience in conveying key details of technical issues to stakeholders
  • Experience with broad public-facing communications (e.g. status pages, tweets) and/or targeted communications (e.g. direct emails, support ticket responses).
  • Familiarity with distributed architectures and system inter-dependencies that operate in a cloud environment.

Top Skills

Cloud Environments
Grafana
Prometheus
Splunk
SQL
HQ

Stripe Dublin, Dublin, IRL Office

Grand Canal Street Lower, Dublin, Dublin, Ireland

