Felix Drinkall

Oxford PhD · AI Consultant · ex-GB Athlete

University of Oxford

Biography

I am an AI consultant working on applied NLP and forecasting problems for Man Group and ETH. I adapt and deploy generative language models for information discovery in financial markets, spanning RAG, supervised fine-tuning, vLLM-based serving, and multimodal data. I work with companies looking to turn modern LLMs into dependable components of real-world systems, whether that is scoping a problem, prototyping, or taking a model into production.

I am waiting to defend my PhD in Machine Learning at the University of Oxford’s Department of Engineering Science, co-supervised by Janet B. Pierrehumbert and Stefan Zohren. My thesis focuses on incorporating Natural Language Processing into forecasting settings, and has two main themes: first, I explore how to extract and encode text to help improve economic, financial, or epidemiological forecasts; second, I test whether modern LLMs are suitable in this setting. I am interested in understanding how the temporal bias implicit within statically trained LLMs affects predictions. I have probed LLMs for temporal leakage, developed point-in-time training regimes that remove look-ahead bias, and researched methods to scale temporal bias removal to larger models through targeted knowledge editing. I am also a visiting researcher at ETH Zurich under Elliott Ash, where I am building on my PhD work by training larger point-in-time LLMs.

Before diving full-time into research, I rowed for Great Britain, winning Junior and U23 World Championship titles. I take the lessons of discipline and resilience from elite sport into my working life.

Download my CV.

Interests

Natural Language Processing
Text in forecasting
Knowledge Editing
LLM Interpretability

Education

PhD in Natural Language Processing, 2021 - present
University of Oxford
MEng in Engineering Science, 2017 - 2021
University of Oxford

Publications

Felix Drinkall, Janet B. Pierrehumbert, Stefan Zohren

May 2025 ACL Findings 2025

Stories that (are) Move(d by) Markets: A Causal Exploration of Market Shocks and Semantic Shifts across Different Partisan Groups

We demonstrate that semantic shifts in news language can be causally linked to financial market shocks, with partisan differences influencing predictive power. Text signals prove particularly valuable during unexpected events like COVID-19.

Eghbal Rahimikia, Felix Drinkall

January 2025 SSRN (top-10 most downloaded paper in past 12 months)

Re(Visiting) Large Language Models in Finance

This study introduces a novel suite of historical large language models (LLMs) pre-trained specifically for accounting and finance, utilising a diverse set of major textual resources. The models are unique in that they are year-specific, spanning from 2007 to 2023, effectively eliminating look-ahead bias, a limitation present in other LLMs. Empirical analysis reveals that, in trading, these specialised models outperform much larger models, including the state-of-the-art LLaMA 1, 2, and 3, which are approximately 50 times their size. The findings are further validated through a range of robustness checks, confirming the superior performance of these LLMs.

Felix Drinkall, Janet B. Pierrehumbert, Stefan Zohren

January 2025 arXiv preprint

When Dimensionality Hurts: The Role of LLM Embedding Compression for Noisy Regression Tasks

We demonstrate that compressing LLM embeddings can improve performance on noisy regression tasks like financial prediction by reducing overfitting, suggesting compression acts as a regularization mechanism.

Felix Drinkall, Janet B. Pierrehumbert, Stefan Zohren

January 2025 FinNLP COLING 2025

Forecasting Credit Ratings: A Case Study where Traditional Methods Outperform Generative LLMs

We demonstrate that traditional machine learning methods combining fundamental data with text embeddings outperform current LLMs at credit rating forecasting, highlighting limitations of LLMs for multimodal financial tasks.

Felix Drinkall, Eghbal Rahimikia, Janet B. Pierrehumbert, Stefan Zohren

June 2024 In NAACL Findings

Time Machine GPT

We introduce a series of point-in-time language models designed to remain uninformed about future information, addressing temporal bias in LLMs for time-series forecasting applications.

Felix Drinkall, Stefan Zohren, Janet B. Pierrehumbert

June 2022 In NAACL

Forecasting COVID-19 Caseloads Using Unsupervised Embedding Clusters of Social Media Posts

We use transformer-based language models to extract features from Reddit posts for COVID-19 forecasting; these features outperform traditional epidemiological data in predicting upward trend signals.
