Sr. Observability Engineer - London, N1C 4AG

Universal Music Group
London

Job Summary:

We are UMG, the Universal Music Group. We are the world’s leading music company. In everything we do, we are committed to artistry, innovation and entrepreneurship. We own and operate a broad array of businesses engaged in recorded music, music publishing, merchandising, and audiovisual content in more than 60 countries. We identify and develop recording artists and songwriters, and we produce, distribute and promote the most critically acclaimed and commercially successful music to delight and entertain fans around the world.

As a Senior Observability Engineer, you will be a driving force for technical excellence and strategic vision within our global team. You will be instrumental in architecting, building, and leading our comprehensive observability strategy to ensure the reliability, performance, and scalability of our critical IT systems. This senior role demands a passion for data-driven strategy, a commitment to automation, and the ability to mentor and lead. You will not only solve complex technical challenges but also influence the direction of observability practices across UMG globally, ensuring our technology landscape is as world-class as our music.

Job Functions:

  • Architecture & Strategy: Lead the architectural design and strategic roadmap for our observability stack. Drive the vision for world-class monitoring, logging, tracing, and alerting solutions across our hybrid and cloud-native environments.

  • Innovate & Automate: Spearhead the evaluation, selection, and implementation of cutting-edge observability tools and platforms (e.g., Dynatrace, OpenTelemetry, Prometheus, Grafana). Architect and build robust, automated observability pipelines. Take an active part in defining and documenting processes and best practices.

  • Optimize & Analyze: Conduct deep-dive analysis of telemetry data to proactively identify performance bottlenecks, optimize resource utilization, and guide capacity planning.

  • Lead & Mentor: Act as a technical leader and mentor for the observability team and wider engineering groups. Champion and enforce best practices, fostering a culture of proactive and data-informed decision-making.

  • Drive Incident & Problem Management: Work with Operations teams on high-priority incident resolution, using deep analysis of telemetry data for swift root cause identification. Drive post-incident reviews and implement long-term solutions to enhance system resilience.

  • Collaborate & Influence: Partner with Development, SRE, and Infrastructure leaders to embed observability into the entire technology lifecycle. Influence and drive the adoption of observability best practices across the global organization. Champion the use of observability in the global UMG environment.

  • Make UMG the place to be: Mentor, manage and genuinely lead the Observability team in a way that attracts and retains the best talent. UMG is a place where everyone can bring themselves fully to work and thrive; as a leader, you are a key part of this.

Job Requirements:

Essential Qualifications

  • Experience: 5-7+ years of hands-on experience in an Observability, Site Reliability Engineering (SRE), or DevOps role, with a proven track record of leading complex projects.

  • Technical Leadership: Demonstrated experience in architecting and designing large-scale monitoring and observability solutions.

  • Expert-Level Tooling: Deep expertise with modern observability platforms (e.g., Dynatrace, AWS CloudWatch, Prometheus, Grafana, ELK Stack, Splunk, OpenTelemetry).

  • Cloud & Infrastructure: Advanced knowledge of major cloud platforms (AWS, Azure, GCP), containerization (Docker, Kubernetes), and Infrastructure as Code (Terraform, Ansible).

  • Programming & Automation: Strong programming and scripting skills (e.g., Python, Go, Shell) with a focus on creating scalable automation and custom tooling.

  • Problem-Solving: Exceptional analytical and strategic problem-solving skills, with the ability to lead through complex technical challenges.

  • Data Analysis: Expertise in analysing and visualising telemetry data, turning it into meaningful information that drives action.

  • Hands-on: Demonstrable hands-on engineering and coding experience, with the ability to deep-dive into existing and emerging technologies to identify opportunities and solutions.

  • Containerization and Orchestration: Understanding of container technologies (e.g., Docker) and container orchestration platforms (e.g., Kubernetes) to monitor and manage containerized applications.

  • Networking Knowledge: Understanding of networking principles and protocols to effectively monitor and troubleshoot network-related issues.

  • Security Awareness: Awareness of security best practices and the ability to integrate security monitoring into observability processes.

  • Communication & Influence: Excellent communication and interpersonal skills, capable of articulating a technical vision to diverse audiences and influencing senior stakeholders. Ability to collaborate with cross-functional teams, convey findings, and discuss improvements with developers and operations teams.

  • Continuous Learning: Given the dynamic nature of technology, a commitment to continuous learning and staying updated on the latest trends in observability and monitoring.

  • Self-motivated with a high degree of initiative and excellent follow-up skills, along with strong analytical and problem-solving skills.

  • Travel may be required but is not part of the regular work schedule.

  • Bachelor's degree in a technology-related field, as well as 5+ years of relevant experience within the observability field.

Desired Qualifications

  • Advanced Concepts: Proven experience with Chaos Engineering, AI-driven analytics, defining SLOs/SLIs, and advanced deployment strategies (Canary/Blue-Green).

  • Software Engineering Foundation: Strong background in software engineering principles, database administration, and distributed systems architecture.

  • Certifications: Relevant senior-level industry certifications (e.g., AWS Certified DevOps Engineer - Professional, Certified Kubernetes Administrator).

Posted 2026-04-03
