Rithum

Senior Site Reliability Engineer

Posted 5 Days Ago
Remote
Hiring Remotely in Ireland
Senior level
The Senior Site Reliability Engineer will build and maintain fault-tolerant systems, improve operations, mentor team members, and collaborate to enhance system reliability using automation and AI/ML solutions.
The summary above was generated by AI

Rithum™ is the world’s most trusted commerce network, accelerating how brands, suppliers, and retailers work together to deliver seamless e-commerce experiences. We provide an unmatched platform for brands and retailers, enabling them to accelerate growth, optimise operations across channels, scale product offerings and enhance margins.

Today, more than 40,000 companies trust Rithum to grow their business across hundreds of channels, representing over $50 billion in annual GMV. Using our commerce, marketing, and delivery solutions, our customers create optimised consumer shopping journeys from beginning to end.


Overview

As a Senior Site Reliability Engineer in our Platform Engineering Organization, you help build and run large-scale, distributed, fault-tolerant systems. In this role, you are involved in the complete lifecycle of our products from inception to operation, ensuring they are reliable and performant and meet appropriate uptime and availability targets. You design and maintain resilient systems, implement robust observability through metrics, logging, and tracing, and build automation that improves deployment, monitoring, and incident response workflows. This includes leveraging AI/ML for intelligent alerting, anomaly detection, and predictive incident response to enhance system reliability and scalability. Working with others in the organization, you help develop and influence operational tooling, best practices, and standards that empower the engineering organization and help ensure Rithum's effective and efficient operations. As a Senior Engineer, you operate independently: you prioritize your own work, design and lead projects from start to completion, and engage with stakeholders to ensure successful delivery. You mentor less experienced team members and coach them to help improve their skills.

 

Responsibilities

  • Collaborate with developers, Client Support, and cross-functional teams to build production automation and analysis tools and to improve reliability and performance.
  • Design, implement, and maintain robust application monitoring and observability systems for a distributed, highly available, and scalable software stack, leveraging AI/ML to detect anomalies and assist with incidents.
  • Analyse and resolve problems in legacy environments while designing and implementing modern, scalable solutions from the ground up.
  • Participate in the rotating on-call schedule, ensuring that user emergencies, platform alerts, and support requests are addressed.
  • Drive automation and operational efficiency.

Qualifications 

Minimum Qualifications  

  • 3+ years' experience working as an SRE, DevOps Engineer, or in a related role
  • Experience with logging and monitoring systems like CloudWatch, Grafana or Prometheus
  • Experience with AWS foundations, including compute, storage, and security
  • Good AWS knowledge including application design, migration support, cost planning, capacity allocation, and application resiliency
  • Expertise in creating multi-region cloud systems with a solid disaster recovery plan
  • Experience with both high-level and scripting languages like Python, Bash or TypeScript
  • Experience troubleshooting and debugging complex, distributed applications
  • IaC experience automating infrastructure with CDK, Terraform or Ansible
  • Experience with continuous deployment pipelines and container platforms like EKS or ECS
  • Strong understanding of software engineering fundamentals, including object-oriented design, modular architecture, and maintainable coding practices

Preferred Qualifications 

  • A bachelor's degree or higher in Computer Science or a related field, or equivalent practical experience demonstrating strong software engineering fundamentals.
  • Experience working in a highly collaborative environment with both platform and product teams.
  • Excellent collaboration and communication skills; you consistently learn new technologies and help foster an environment of continuous improvement and innovation.
  • Client satisfaction focus.

Travel Required

Up to 10%


Other Duties

Please note this job description is not designed to cover or contain a comprehensive listing of activities, duties or responsibilities required of the employee for this job. Duties, responsibilities, and activities may change at any time with or without notice.


What it’s like to work at Rithum 

When you join Rithum, you can expect to work with smart risk-takers, courageous collaborators, and curious minds.

As part of the Rithum team, you are valued, supported, and included. Guided by a transparent culture and accessible, approachable leadership, we offer career opportunities aligned to your ambitions and talents. To ensure work-life balance works for you, we also offer an array of resources to support you and your family, including comprehensive benefits and wellness plans.

At Rithum you will:

  • Partner with the leading brands and retailers.
  • Connect with passionate professionals who will help support your goals.
  • Participate in an inclusive, welcoming work atmosphere.
  • Achieve work-life balance through remote-first working conditions, generous time off, and wellness days.
  • Receive industry-competitive compensation and total rewards benefits.

 

Benefits 

  • Medical coverage provided through Irish Life Health; premiums paid by the company 
  • Life & disability insurance
  • Pension plan with 5% company match
  • Competitive time off package with 25 days of PTO, 11 company-paid holidays, 2 wellness days and 1 paid volunteer day
  • Access to tools to support your wellbeing such as the Calm App and an Employee Assistance Program 
  • Professional development stipend and learning and development offerings to help you build the skills and connections you need to move forward in your career. 
  • Charitable contribution match per team member  

Rithum is an equal opportunity employer. We are committed to providing an environment of mutual respect where equal employment opportunities are available to all applicants and teammates without regard to race, religion, color, sex, gender identity, sexual orientation, age, non-disqualifying physical or mental disability, national origin, veteran status or any other protected characteristic. All employment is decided on the basis of qualifications, merit, and business need.

We're committed to providing reasonable accommodations in accordance with the law for qualified applicants. If you require assistance during the interview process due to a medical condition or need support accessing our website or completing the application process, please reach out to us by completing the Accommodations Request Form. Your comfort and accessibility are important to us, and we're here to ensure a seamless experience as you explore opportunities with our team.

Top Skills

Ansible
AWS
Bash
CDK
CloudWatch
ECS
EKS
Grafana
Prometheus
Python
Terraform
TypeScript
