Principal Site Reliability Engineer
Job Reference: BBBH 19332
Location: Dublin City
Our client, a global mobile and internet services provider, is looking to hire a seasoned Principal Site Reliability Engineer to join their Cloud Systems Engineering group, responsible for mobile services and apps operations and maintenance within the EU, supporting a growing customer base of over 70 million users.
This is an excellent opportunity to work alongside a very experienced team, leveraging your own expertise and applying SRE principles to ensure services and applications run optimally across public cloud environments. You will also identify improvements in availability, latency, performance, efficiency, change management, monitoring, emergency response, capacity planning, and optimization and automation tooling.
Your role will involve
- Administering, maintaining and automating systems to ensure reliability, resiliency, scalability and security
- Working on Mobile Services (applications, networks, databases) deployment at different stack layers, addressing challenges related to scalability, reliability, performance and efficiency of systems
- Monitoring and debugging high-severity incidents (applications, networks, databases), applying SRE principles and forensic root cause analysis techniques, and crafting smart resolutions to address them.
- Reporting on key service metrics such as availability, capacity, performance and latency across all production systems.
- Identifying opportunities to improve operations and implementing quality-management initiatives and processes based on statistical measurements to drive quality improvement, automating routine tasks, documenting new procedures, etc.
- Setting standards for deployments at scale, infrastructure reliability and scalability.
- On-call support for high-severity incidents may be required (on average, one week in every two months or less)
Requirements
- 7+ years of work experience in IT systems engineering / production support
- 2+ years of work experience as a senior SRE managing a SaaS / PaaS cloud environment at scale for a global internet customer base
- Demonstrated expertise managing a Linux environment with strong scripting expertise (Bash, Python) and databases (e.g. MySQL, SQL Server, Apache Cassandra / NoSQL)
- Experience in network technologies and internet protocols e.g. TCP/IP, Ethernet, UDP, DHCP, DNS, ARP, WAN Routing.
- A seasoned SRE who can leverage their experience in incident management, change management, and problem management
- A systems thinker who manages and communicates effectively, driving positive outcomes.
Any experience in the following would also be of benefit:
- Experience configuring and deploying virtual machines (VMs) and cloud infrastructure (AWS, Azure)
- Experience working with automation / orchestration / monitoring tools e.g. Ansible, Puppet, CloudWatch, Chef, Kubernetes
- Experience managing Apache/Tomcat/Nginx environments
- Bachelor’s Degree in Computer Science / Telecommunications or equivalent
What is on offer
- A very attractive salary with a customisable benefits package
- Superb Dublin City Centre offices & location (when offices reopen).
- Laptop, mobile, etc.
What makes this role attractive
Impressive scale of growth with over 70 million users in Europe
A top-rated search engine in Europe with extensive libraries of mobile apps and services
Impressive tech stacks and internal tools, platforms and libraries
Opportunity to shape and influence operations and SRE principles
Local team, focus on European service provision
Still interested in this opportunity?
Submit your CV (in Word or Txt format, as our ATS has a tendency to mangle PDF docs). If this role does not fully fit your criteria, not to worry: we have numerous roles advertised on our website, www.allenrec.com
Please don’t hesitate to contact any of our team by phone on +353 1 6694040 with any questions you may have.