LearnUpon

Senior Engineer, Site Reliability

Posted 6 Days Ago
In-Office
Dublin, IRL
Senior level
As a Senior Site Reliability Engineer at LearnUpon, you'll manage and enhance cloud infrastructure, ensuring high availability and reliability while automating processes and collaborating with development teams.
The summary above was generated by AI

LearnUpon is looking for a Senior Engineer, Site Reliability to join our team in Dublin. This is a flex role, working 1 day per week from LearnUpon's Dublin office.

LearnUpon LMS helps organizations train their employees, partners, and customers. Businesses can manage, track, and achieve their unique learning goals — all through a single, powerful solution.

With offices in Dublin (our HQ), Belgrade, Philadelphia, Salt Lake City and Sydney, we are a global team with lots of diverse cultures, backgrounds, and experiences that puts our customers' experience at the heart of everything we do. Our culture fosters an open, collaborative and supportive environment where our accomplishments are celebrated and encouraged. We're always striving for the best solution (not the easy one). We’re proud of our success and we’re humble and hungry to achieve more. 

About the team:

You will be part of the SRE Team, which sits within LearnUpon’s Engineering group. We are a small team focused on developing and supporting our cloud infrastructure and app services to ensure platform scalability and site uptime. Our flagship product is coded predominantly in Ruby on Rails, with data managed through a common mix of current SaaS back-end technologies, including AWS-backed services. We also use local containerised development environments. However, we are not bound to our tech stack. We prefer choosing the right technology for the right problem, so you’ll have plenty of space to grow your skills. We are key consultants for the entire company on matters of infrastructure feasibility.

What will I be doing?

As a Senior Site Reliability Engineer, you will be responsible for the day-to-day operation and management of the LearnUpon platform infrastructure. You will be designing, implementing and maintaining a highly available and scalable infrastructure, with a primary focus on planning ahead for our future as we look to transition towards a containerised environment.

While our tool stack is predominantly AWS, Terraform and Ansible, we welcome anyone with experience of similar technologies to be part of our journey.

  • Identify opportunities to improve and scale our infrastructure for performance, observability, maintainability, and cost by creating innovative solutions with a strong emphasis on infrastructure as code.
  • Ensure System Reliability and Efficiency: Continuously monitor and improve the reliability, scalability, and performance of our SaaS platform, ensuring high availability and optimal functioning of services in a cloud-based environment.
  • Incident Management and Response: Lead and participate in incident response and post-mortem analysis to effectively manage and resolve production issues, minimise downtime, and implement preventative measures for future incidents.
  • Automation and Tool Development: Develop and maintain automation tools and scripts to streamline operations, reduce manual efforts, and increase system efficiency. Focus on automating routine tasks and deployment processes to enhance system stability.
  • Cross-functional Collaboration: Work closely with development teams to integrate best practices and reliability into the software development lifecycle. Collaborate with product and support teams to understand customer needs and provide technical solutions.
  • Capacity Planning and Resource Optimisation: Engage in capacity planning and resource optimisation strategies to manage workload demands and resource utilisation, ensuring cost-effective scalability and performance.
  • Mentor junior talent.
  • Participate in the on-call rota.

What skills do I need?
  • At least five years of production system administration/SRE experience.
  • At least three years supporting a large-scale SaaS web application on AWS or a similar cloud provider.
  • Strong experience implementing infrastructure as code (e.g. CloudFormation, Terraform), automation tooling (e.g. Puppet, Ansible), and CI/CD (e.g. Jenkins, Travis CI, GitLab).
  • Experience implementing observability tech stacks using tools such as Grafana, Prometheus, Datadog, or New Relic.
  • Ability to analyse and optimise performance in high-traffic web applications.
  • Ability to solve complex, high-impact problems.
  • Experience building and supporting large-scale distributed systems that back a consumer app or website with associated requirements of performance, security and disaster recovery.
  • Ability to communicate technical ideas effectively to, and collaborate with, both technical and non-technical peers.
  • Experience deploying microservice environments using containerisation technologies such as Docker and Kubernetes is an advantage.

Don’t worry if you don’t tick every box; we’re always happy to review applications and take all experience into consideration. We do our best to provide feedback where we can!

Not required but considered a big plus
  • Certification in AWS, any PaaS, and/or related technologies.

Why work with us?
  • Competitive salary and company ESOP. 
  • 25 days’ annual leave and 1 annual wellness day.
  • Private health insurance and company pension.
  • Parental benefits, including up to 26 weeks’ paid maternity leave, 4 weeks’ paid paternity leave, and coaching support for new parents.
  • Up to 4 weeks per year working abroad (role eligibility applies).
  • Clear career progression opportunities — take LearnUpon where you think it can go.
  • A collaborative and supportive environment with regular team events.

What is the Hiring Process?

Our typical process generally works as follows:

  • Qualified applicants will be invited to schedule a screening call.
  • Successful candidates will then be invited to a series of practical interviews.
  • Finally, candidates will have a short interview with a member of our C-Suite Team.
  • The successful candidate will be contacted with an offer to join our team.

LearnUpon is an Equal Opportunities Employer. 

We do not discriminate on the basis of gender, marital status, family status, age, disability, sexual orientation, race, religion, membership of the Traveller community, or any other legally protected status.

By applying for this job, you agree to LearnUpon's Privacy Policy.

Visit our Careers site to find out more about working for LearnUpon, and check us out on Instagram.


Top Skills

Ansible
AWS
CI/CD
CloudFormation
Datadog
Docker
GitLab
Grafana
Jenkins
Kubernetes
New Relic
Prometheus
Puppet
Ruby on Rails
Terraform
Travis CI
HQ

LearnUpon Dublin, Dublin, IRL Office

1st Floor Ocean House, Arran Quay, Dublin, Dublin, Ireland, D07 DHT3
