Research
We publish our findings. Transparent methodology, open data, reproducible results.
Working Paper
Benchmarking Local LLMs for Financial Tweet Sentiment
Can locally deployed open-weight LLMs match cloud APIs for financial sentiment analysis? We benchmark 12 Ollama models against VADER and FinBERT on 1,870 real financial tweets from 27 Twitter/X accounts, running entirely on consumer hardware.
We evaluate Qwen3, Gemma3, LLaMA, Mistral, Phi-4, DeepSeek-R1, and three finance-tuned models with full ML metrics, statistical significance tests, and confidence calibration.
Pre-print (coming soon)
Code & Data
1,870 real tweets
14 models tested (12 Ollama LLMs plus VADER and FinBERT)
27 Twitter/X handles
4 GB VRAM required
See how research powers the platform
Our strategies are backed by transparent methodology and reproducible backtests.
Get Started Free