Data Scientist (ML, Speech, NLP & Multimodal Expertise) | London

Transperfect Gaming Solutions
London
We are looking to hire a Data Scientist with strong expertise in machine learning, speech and language processing, and multimodal systems. This role is essential to driving our product roadmap forward, particularly in building out our core machine learning systems and developing next-generation speech technologies.

The ideal candidate will be capable of working independently while effectively collaborating with cross-functional teams. In addition to deep technical knowledge, we are looking for someone who is curious, experimental, and communicative.

Key Responsibilities:
• Create maintainable, elegant code and high-quality data products that are well-modeled, well-documented, and simple to use.
• Build, maintain, and improve the infrastructure to extract, transform, and load data from a variety of sources using SQL, Azure, GCP and AWS technologies.
• Perform statistical analysis of training datasets to identify biases, quality issues, and coverage gaps.
• Implement automated evaluation pipelines that scale across multiple models and tasks.
• Create interactive dashboards and visualization tools for model performance analysis.

Additional Responsibilities:
• Design and implement robust data ingestion pipelines for massive-scale text and speech corpora, including automated data preprocessing and cleaning pipelines.
• Create data validation frameworks and monitoring systems for dataset quality.
• Develop sampling strategies for balanced and representative training data.
• Implement comprehensive experiment tracking and hyperparameter optimization frameworks.
• Conduct statistical analysis of training dynamics and convergence patterns.
• Design A/B testing frameworks for comparing different training approaches.
• Create automated model selection pipelines based on multiple evaluation criteria.
• Develop cost-benefit analyses for different training configurations.
• Design comprehensive benchmark suites with statistical significance testing.
• Develop fairness metrics and bias detection systems.
• Build real-time monitoring systems for model performance in production.
• Implement feature drift detection and data quality monitoring.
• Design feedback loops to capture user interactions and model effectiveness.
• Create automated retraining pipelines based on performance degradation signals.
• Develop business metrics and ROI analysis for model deployments.

Required Skills, Experience and Qualifications

Programming & Software Engineering
Python (Expert Level): Advanced proficiency in scientific computing stack (NumPy, Pandas, SciPy, Scikit-learn).
Version Control: Git workflows, collaborative development, and code review processes.
Software Engineering Practices: Testing frameworks, CI/CD pipelines, and production-quality code development.

Machine Learning and Language Model Expertise
Traditional Machine Learning and Deep Learning Knowledge: Proficiency in classical ML algorithms (Naive Bayes, SVM, Random Forest, etc.) and Deep Learning architectures.
Understanding of Transformer Architecture: Attention mechanisms, positional encoding, and scaling laws.
Training Pipeline Knowledge: Data preprocessing for large corpora, tokenization strategies, and distributed training concepts.
Evaluation Frameworks: Experience with standard NLP benchmarks (GLUE, SuperGLUE, etc.) and custom evaluation design.
Fine-tuning Techniques: Understanding of PEFT methods, instruction tuning, and alignment techniques.
Model Deployment: Knowledge of model optimization, quantization, and serving infrastructure for large models.

Collaboration & Adaptability
• Strong communication skills are a must
• Self-reliant but knows when to ask for help
• Comfortable working in an environment where conventional development practices may not always apply:
o PBIs (Product Backlog Items) may not be highly detailed
o Experimentation will be necessary
o Ability to identify what's important in completing a task or partial task and explain/justify their approach
o Can effectively communicate ideas and strategies
• Proactive and takes initiative rather than waiting for PBIs to be assigned when circumstances call for it
• Strong interest in AI and its possibilities; a genuine passion for certain areas can provide that extra spark
• Curious and open to experimenting with technologies or languages outside their comfort zone

Mindset & Work Approach
• Takes ownership when things don't go as planned
• Capable of working from high-level explanations and general guidance on implementations and final outcomes
• Continuous, clear communication is crucial; detailed step-by-step instructions won't always be available
• Self-starter, self-motivated, and proactive in problem-solving
• Enjoys exploring and testing different approaches, even in unfamiliar programming languages

Additional Skills, Experience and Qualifications

Machine Learning & Deep Learning:
Framework Proficiency: Scikit-learn, XGBoost, PyTorch (preferred) or TensorFlow for model implementation and experimentation.
MLOps Expertise: Model versioning, experiment tracking, model monitoring (MLflow, Weights & Biases), data monitoring and validation (Great Expectations, Prometheus, Grafana), and automated ML pipelines (GitHub CI/CD, Jenkins, CircleCI, GitLab, etc.).
Statistical Modeling: Hypothesis testing, experimental design, causal inference, and Bayesian statistics.
Model Evaluation: Cross-validation strategies, bias-variance analysis, and performance metric design.
Feature Engineering: Advanced techniques for text, time-series, and multimodal data.

Data Engineering & Infrastructure:
Big Data Technologies: Spark (PySpark), Hadoop ecosystem, and distributed computing frameworks (DDP, TP, FSDP).
Cloud Platforms: AWS (SageMaker, S3, EMR), GCP (Vertex AI, BigQuery), or Azure ML.
Database Systems: NoSQL databases (MongoDB, Elasticsearch), graph databases (Neo4j), and vector databases (Pinecone, Milvus, ChromaDB, FAISS, etc.).
Data Pipeline Tools: Airflow, Prefect, or similar orchestration frameworks.
Containerization: Docker and Kubernetes for scalable model deployment.
Posted 2025-09-07
