Resend is building the most accessible email platform for developers. As we’ve grown to over 15K customers and continue to onboard thousands of new users every day, the challenge of maintaining a reliable, scalable, and observable platform has grown with us.
You’ll design and operate the systems that keep Resend fast, reliable, and self-healing. From monitoring pipelines to automation, you’ll help build the foundation that allows every engineer to move confidently and safely.
In this role you will...
Evolve and shape our on-call processes, from detection to resolution
Build automation for recovery, scaling, and self-healing systems
Improve observability across the stack: logs, metrics, traces, and dashboards
Define and track SLOs for core systems like email delivery, API latency, and queue performance
Collaborate closely with engineering teams to design for reliability, not just react to incidents
Codify playbooks, postmortems, and reliability standards
Work with infrastructure spanning AWS, queues, databases, and workers
Bring 5+ years of experience in Site Reliability, Platform, or Infrastructure Engineering
Build and enhance backend services that drive user-facing features
Have deep experience with Node.js and TypeScript (Express, Hono, Next.js)
Have infrastructure and reliability skills (Datadog, AWS, Terraform, CDK)
Are fluent in writing and speaking English
Have strong experience with observability and monitoring tools (Datadog, Grafana, OpenTelemetry)
Understand distributed systems: queues, workers, caching, databases, networking
Know how to design systems with safety and fail-safe operations in mind
Are comfortable working across the stack — from load balancers to delivery pipelines
Care deeply about incident management, postmortems, and continuous improvement
Autonomy to "just ship it"
100% remote team with flexible working schedules
Modern tech stack
Honest and low-ego team
Ownership of problems and solutions
We are building the modern email sending platform for developers. We care deeply about quality, creating for everyone and building in the open. We started with an open source project in 2022. Now, we onboard nearly 100 paying customers every day and foster a growing developer community.
Our fully remote team of 25 humans spans 7 countries... and counting. We’re backed by a16z, Y Combinator, Basecase, and other top investors.
Read more about how we work, how we hire, and what we value here.


