Applied Researcher (Monitoring)
- 2+ years of experience conducting empirical research with large language models or AI systems
- Strong experience with AI coding agents, for example having extensively used and compared frontier coding agents, or having designed or developed coding agents
- Experience with LLM-as-a-judge setups
- Experience designing and running experiments, analyzing results, and iterating based on empirical findings, e.g. in prompting, scaffolding, agent design, fine-tuning, or RL
- Strong Python programming skills
- Demonstrated ability to work independently on open-ended research problems

Bonus:
- Experience with AI evaluation frameworks, in particular Inspect (though other frameworks are relevant as well)
- Familiarity with AI safety concepts, particularly agent-related risks
- Familiarity with computer security, e.g. security testing and secure system design
- Experience fine-tuning language models or working with smaller open-source models
- Previous work building developer tools or monitoring systems
- Publications or contributions to AI safety or ML research
- Experience with production log systems or production log analysis

We want to emphasize that people who feel they don't fulfill all of these characteristics but think they would be a good fit for the position nonetheless are strongly encouraged to apply. We believe that excellent candidates can come from a variety of backgrounds and are excited to give you opportunities to shine.

- Build a comprehensive failure mode database: Systematically collect and categorize 100+ distinct AI agent failure modes across safety and security dimensions, creating the foundation for our monitoring library.
- Develop and validate monitoring approaches: Create and empirically test monitoring prompts and strategies for key failure categories, establishing clear metrics for monitor performance and building evaluation frameworks to track progress.
- Optimize the monitoring pipeline: Improve log preprocessing and monitor scaffolding to achieve measurable improvements in detection accuracy, false positive rates, and computational efficiency.
- Advance monitoring capabilities: Begin work on advanced approaches such as fine-tuned specialized monitors or agentic investigation systems, moving our monitoring from reactive detection toward proactive risk identification.
- Hierarchical monitoring for coding agent security: Design a multi-layer monitoring system for detecting security vulnerabilities introduced by coding agents. Start by cataloging common security failure modes (e.g., hardcoded credentials, SQL injection vulnerabilities, insecure API calls). Build specialized monitors for each category, then create a hierarchical system where fast, efficient first-pass monitors flag potentially problematic code for deeper investigation by more sophisticated monitors. Validate the system on synthetic test cases and real agent outputs, iterating to optimize the tradeoff between detection rates and false positives while maintaining sub-second latency for most monitoring decisions.
- Salary: 100k - 180k GBP (~135k - 245k USD)
- Flexible work hours and schedule
- Unlimited vacation
- Unlimited sick leave
- Lunch, dinner, and snacks are provided for all employees on workdays
- Paid work trips, including staff retreats, business trips, and relevant conferences
- A yearly $1,000 (USD) professional development budget
- Start Date: Target of 2-3 months after the first interview
- Time Allocation: Full-time
- Location: The office is in London, and the building is next to the London Initiative for Safe AI (LISA) offices. This is an in-person role. In rare situations, we may consider partially remote arrangements on a case-by-case basis
- Work Visas: We can sponsor UK visas