AMCS Group

Site Reliability Engineering Technical Lead

Posted 11 Days Ago
In-Office
Dublin, IRL
Senior level
Lead SRE/DevOps efforts to ensure reliability, scalability, and security across multi-cloud environments. Define SLIs/SLOs, lead incident response and postmortems, evolve observability (Prometheus/Grafana/OpenTelemetry), drive automation to reduce toil, optimize cost and performance, apply AI/LLM for ops, and provide architectural oversight and mentoring.
The summary above was generated by AI

Sustainability that means business

Who we are:

Sustainability software specialist AMCS is headquartered in Ireland, with offices in Europe, the USA, and Australasia. With over 1,300 highly skilled employees across 22 countries, we specialize in delivering technology solutions to facilitate a carbon-neutral future.

What we do:

Our innovative SaaS solutions increase efficiency and boost sustainability in resource-intensive industries. Over 5,000 customers across 23 countries already benefit from our Performance Sustainability software, ensuring we deliver practical solutions for improved profitability and environmental resilience across the globe.

Our people:

AMCS offers team members more than just a job: an opportunity to map out a career with a company that is growing, evolving and setting out new ways of working that have a positive impact on the world around us. AMCS was established in Ireland and holds onto those local roots and ‘start-up’ mentality with a culture of connection. This connection to our work, our customers, our colleagues and our community creates a working environment that fosters openness, collaboration and creativity.

Job Description:

We are seeking a highly skilled and motivated DevOps/SRE Tech Lead to join our dynamic engineering team. The ideal candidate will have a deep understanding of cloud technologies, a strong technical background and a passion for driving operational excellence. As a Tech Lead, you will mentor and guide our DevOps engineers and participate in architectural and key decision-making forums on our infrastructure and application development processes, keeping the focus on the reliability of our systems and a positive customer experience. You will collaborate with cross-functional teams to ensure the reliability, scalability, and security of our systems and infrastructure.

Key Responsibilities:

  • Build SLIs, SLOs, and SLAs: Partner with development and business teams to define indicators and objectives that reflect real customer experience.

  • Incident Response: Lead through complex incidents and continuously improve how quickly we detect, diagnose, and resolve issues — sharpening alerting, tooling, and on-call practices to shorten MTTD and MTTR over time.

  • Evolve Monitoring and Observability Stack: Continuously improve the observability stack (Prometheus, Grafana, Mimir, Loki, Tempo, OpenTelemetry) with a customer-centric lens, making our operations more effective.

  • Drive RCAs and Postmortems: Run blameless root cause analyses and postmortems that turn incidents into durable improvements, closing the loop between development and operations.

  • High Availability & Performance: Ensure platform availability and responsiveness meet customer expectations. Identify and remove performance bottlenecks before they impact customers.

  • AI for Operations: Apply AI/LLM capabilities to incident triage, log/trace analysis, runbook execution, and anomaly detection to shorten MTTR and reduce on-call load.

  • Optimization for Cost: Right-size workloads, eliminate waste, and design for cost-efficient scaling across our cloud platforms (Azure, AWS, GCP) and container infrastructure (Docker, Kubernetes).

  • Toil Reduction: Build automated processes to reduce toil within SRE, such as remediation for known failure modes, so the platform heals itself where possible and escalates to humans only when judgement is genuinely required.

  • Architectural Oversight: Participate in architectural design and decision-making processes, ensuring that design choices align with organizational goals and best practices.

What Success Looks Like:

  • High-Signal Alerting: Alerts are accurate and actionable — when something fires, it matters, and the team trusts it. Noise is actively driven down rather than tolerated.

  • Fewer Production Incidents: The number and severity of customer-impacting incidents trend down over time, as recurring failure modes are addressed at the root rather than worked around.

  • Tight Product–SRE Feedback Loop: Continuous, two-way feedback between product engineering and SRE — reliability concerns shape what gets built, and operational learnings flow back into product decisions.

  • Reduced Toil: Engineers spend less time on repetitive operational work and more time on improvements that compound — measured by what gets automated, eliminated, or self-healed away.

Qualifications:

  • Education: Bachelor’s degree in Computer Science, Engineering, or a related field (or equivalent experience).

  • Experience: 5+ years of experience in DevOps, Site Reliability Engineering (SRE), or related fields, with at least 2 years in a leadership or mentoring role.

  • Cloud Technologies: Deep understanding of cloud providers (Azure, AWS, GCP) and hands-on experience with cloud architecture.

  • Architectural Design: Proven experience in providing architectural oversight, with a strong ability to make informed decisions that drive system performance and scalability.

  • Containerization: Proven experience with container orchestration platforms, particularly Kubernetes.

  • Scripting: Proficiency in scripting languages such as PowerShell, Python or Bash.

  • Monitoring and Logging: Familiarity with monitoring and logging tools like Prometheus, Grafana, and the wider Grafana stack.

  • Automation Tools: Experience with automation tools such as Ansible, Terraform, or Chef.

  • Soft Skills: Strong leadership qualities, excellent communication skills, and a collaborative mindset.

Preferred Qualifications:

  • Experience with CI/CD pipelines and relevant tools (Azure DevOps, Jenkins, GitLab CI, CircleCI, etc.).

  • Kubernetes certification (CKA, CKAD) and/or cloud certifications (Azure, AWS, GCP) are highly desirable.

  • Knowledge of security best practices and compliance standards in cloud environments.

  • Familiarity with Agile methodologies and project management tools.

