Axiome — AI-Enhanced Systematic Investment

Why LLMs fail as market predictors — and where they add value

Large language models are trained to find consensus in data: to converge toward what is most probable given everything the model has seen. In markets, prices already encode that consensus, so an LLM directed at market data will tend to generate conclusions that are already priced in. Signal 1 demonstrates this empirically: given full access to historical financial data and asked to classify stocks by value, the LLM underperforms even the most basic quantitative factor (an annualized return of −1.02%, against +6.62% for Signal 2). This is not a model quality problem. It is a structural one. Markets have no ground truth.

Quantitative systematic signals work because they do not predict direction — they identify persistent statistical patterns that have historically preceded returns. Signal 2 is the raw version of this: a simple value factor constructed from fundamental financial data. Signal 3 applies a proprietary machine learning-based signal enhancement process developed through years of institutional systematic investing experience. The improvement from Signal 2 to Signal 3 reflects the depth of that process — not additional data, but better use of the same data.

Signal 4 uses LLMs in a role they are genuinely suited for: contextual validation. Rather than asking the model to predict returns, we ask it to assess whether the real-world context of a specific stock significantly contradicts what the quantitative signal is saying. Where it does, the position is not taken. The LLM never predicts direction. It provides a contextual validity layer that the quantitative model, by construction, cannot provide for itself. The result is a cleaner, more robust signal: in the backtest, the Sharpe ratio rises from 1.179 (Signal 3) to 1.241 (Signal 4) and the annualized return from +9.61% to +10.38%. That difference is the measurable value of the layer.

The Four Signals

Signal 1

LLM Analyst

A large language model was given complete historical fundamental financial data for each stock up to the rebalance date and asked to score each stock's value on a scale of 1 to 5. Long positions were taken in stocks scored as high value and short positions in stocks scored as low value. The LLM had access to the same data as the quantitative signals: no more, no less.

Ann. Return: −1.02%
Sharpe: −0.301
Max Drawdown: −15.19%
Avg Long / Short: 131 / 104
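
The three performance statistics reported for each signal can be computed from a daily return series. The page does not state the exact conventions used, so the sketch below makes its own assumptions: 252 trading days per year, geometric compounding, and a zero risk-free rate in the Sharpe ratio. The function names are illustrative only.

```python
import math

TRADING_DAYS = 252  # assumption: daily returns, 252 trading days per year


def annualized_return(daily_returns):
    """Geometric annualized return of a daily return series."""
    growth = 1.0
    for r in daily_returns:
        growth *= 1.0 + r
    years = len(daily_returns) / TRADING_DAYS
    return growth ** (1.0 / years) - 1.0


def sharpe_ratio(daily_returns):
    """Annualized Sharpe ratio, assuming a zero risk-free rate."""
    n = len(daily_returns)
    mean = sum(daily_returns) / n
    variance = sum((r - mean) ** 2 for r in daily_returns) / (n - 1)
    return mean / math.sqrt(variance) * math.sqrt(TRADING_DAYS)


def max_drawdown(daily_returns):
    """Largest peak-to-trough decline of the compounded equity curve (<= 0)."""
    equity, peak, worst = 1.0, 1.0, 0.0
    for r in daily_returns:
        equity *= 1.0 + r
        peak = max(peak, equity)
        worst = min(worst, equity / peak - 1.0)
    return worst
```

Under these conventions a drawdown is reported as a negative number, which matches the sign used in the figures on this page.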

Signal 2

Quantitative Value

A standard quantitative value factor constructed from fundamental financial data. Stocks are ranked by earnings yield (earnings per share divided by price) — the most direct measure of how cheaply a stock's earnings can be purchased. Long the cheapest stocks, short the most expensive. No processing or enhancement applied. This is the baseline systematic approach.

Ann. Return: +6.62%
Sharpe: 0.903
Max Drawdown: −11.37%
Avg Long / Short: 100 / 185
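
As an illustration of the baseline factor, the sketch below ranks stocks by earnings yield and builds a dollar-neutral book with gross long +1.0 and gross short −1.0. The selection rule and weighting scheme are not stated on this page, so the `fraction` cutoff and the equal weighting within each leg are assumptions, as is the function name.

```python
def earnings_yield_weights(stocks, fraction=0.2):
    """Build dollar-neutral weights from earnings yield.

    stocks: dict mapping ticker -> (earnings_per_share, price).
    fraction: share of the universe held in each leg (hypothetical cutoff).
    """
    if not 0 < fraction <= 0.5:
        raise ValueError("fraction must be in (0, 0.5]")
    # Earnings yield = earnings per share divided by price.
    yields = {t: eps / price for t, (eps, price) in stocks.items() if price > 0}
    if len(yields) < 2:
        raise ValueError("need at least two stocks with a positive price")
    ranked = sorted(yields, key=yields.get, reverse=True)  # cheapest first
    k = max(1, int(len(ranked) * fraction))
    longs, shorts = ranked[:k], ranked[-k:]
    # Equal weights inside each leg (assumption): longs sum to +1, shorts to -1.
    weights = {t: 1.0 / k for t in longs}
    weights.update({t: -1.0 / k for t in shorts})
    return weights


example = {"A": (10, 100), "B": (5, 100), "C": (1, 100), "D": (2, 100), "E": (8, 100)}
book = earnings_yield_weights(example, fraction=0.4)
```

In this toy universe the two highest-yield names (A and E) are held long and the two lowest (D and C) are held short, each at half of its leg.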

Signal 3

Systematic Signal

Signal 2 enhanced through a proprietary systematic process developed through institutional investing experience. The process improves signal quality without additional data inputs — all enhancement is applied to the same fundamental fields used in Signal 2. The specific methodology is proprietary.

Same input data as all other signals. No additional data sources.

Ann. Return: +9.61%
Sharpe: 1.179
Max Drawdown: −12.30%
Avg Long / Short: 86 / 161

Signal 4

Systematic + LLM Filter

Signal 3 with a qualitative LLM filter applied at each rebalance. For each position the quantitative signal wants to take, a large language model assesses whether the fundamental context of that stock significantly contradicts the signal direction. Positions where the LLM identifies a significant contradiction are not taken. The LLM does not predict returns — it provides a contextual validity check.

Ann. Return: +10.38%
Sharpe: 1.241
Max Drawdown: −11.78%
Avg Long / Short: 83 / 160
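
The filter described for Signal 4 can be pictured as a veto step applied to the positions the quantitative signal proposes. The sketch below is one possible shape for that step, not the production implementation: `contradicts` is a hypothetical callable standing in for the LLM assessment, and rescaling each leg back to ±1.0 after a veto is an assumption made to keep the book dollar-neutral.

```python
def apply_context_filter(weights, contradicts):
    """Drop vetoed positions, then rescale each leg to gross +1 / -1.

    weights: dict mapping ticker -> signed weight from the quantitative signal.
    contradicts: hypothetical callable (ticker, side) -> bool; True means the
        reviewer found a significant contradiction and the position is skipped.
    """
    kept = {}
    for ticker, w in weights.items():
        if w == 0:
            continue  # no position proposed, nothing to review
        side = "long" if w > 0 else "short"
        if not contradicts(ticker, side):
            kept[ticker] = w
    long_gross = sum(w for w in kept.values() if w > 0)
    short_gross = sum(-w for w in kept.values() if w < 0)
    # Rescaling after a veto is an assumption, not something the page specifies.
    return {t: w / (long_gross if w > 0 else short_gross) for t, w in kept.items()}


proposed = {"A": 0.5, "B": 0.5, "C": -0.5, "D": -0.5}
filtered = apply_context_filter(proposed, lambda ticker, side: ticker == "B")
```

The filter can only remove positions. It never adds a position or flips a side, which is consistent with the statement that the LLM does not predict returns.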

Methodology

Universe: S&P 500 constituents (fixed as of January 2020)
Backtest period: January 2012 to January 2026
Signal update frequency: Daily (updates on each new SEC filing)
Input data: Fundamental financial statement data only
All signals use: Identical input data; no signal has an informational advantage
Long-short structure: Dollar-neutral (gross long = +1.0, gross short = −1.0)
Point-in-time integrity: SEC EDGAR filing dates used as data availability anchors
LLM model: Claude (Sonnet 4.6)

All signals use identical fundamental input data. No signal has an informational advantage over any other. The only variable is methodology.

Data sourced from SEC EDGAR financial statement filings. Filing dates used as point-in-time anchors to ensure no forward-looking information is used in signal construction.
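
A minimal sketch of the point-in-time rule follows: on any given date, a signal may only see filings whose filing date has already passed, regardless of the fiscal period they describe. The record layout (`filing_date`, `period_end`, `eps`) and the function name are hypothetical, and treating a filing as usable on its filing date itself (rather than the next day) is an assumption.

```python
from datetime import date


def latest_available(filings, as_of):
    """Return the most recently filed record visible on `as_of`, or None.

    filings: list of dicts, each with a 'filing_date' (datetime.date).
    A record is visible only if its filing_date is on or before `as_of`.
    """
    visible = [f for f in filings if f["filing_date"] <= as_of]
    return max(visible, key=lambda f: f["filing_date"], default=None)


filings = [
    {"period_end": date(2019, 12, 31), "filing_date": date(2020, 2, 20), "eps": 4.10},
    {"period_end": date(2020, 3, 31), "filing_date": date(2020, 5, 5), "eps": 0.95},
]
snapshot = latest_available(filings, as_of=date(2020, 4, 15))
```

On 15 April 2020 the first-quarter period has already ended, but its filing has not yet been published, so the lookup still returns the record filed in February. Keying on the period end date instead would leak information that was not yet public.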

The signals on this page were constructed by an LLM operating within our systematic framework, given constraints on data inputs and complexity. No manual signal engineering. The research you are looking at is itself a demonstration of the approach.

AI cannot predict markets. Here is what it can do.
