Actually put Gemma 4 26B to work on something real: had it extract trading signals from 2,400 earnings calls. One worked. One almost fooled me.

Reddit r/LocalLLaMA / 4/20/2026


Key Points

  • A researcher tested a local “Gemma 4 26B” model by having it read 2,400 earnings-call transcripts (3 years) and extract language features intended to predict stock moves over the following 5 days.
  • After fine-tuning on ~800 labeled examples and running inference in about 14 hours on a single 4090, the study found one statistically meaningful signal: CFO outlook language becoming vaguer (instead of specific guidance numbers) corresponded to ~1.8% underperformance versus the sector.
  • The strongest-looking second signal (“management confidence” in prepared remarks) turned out to be a false “ghost” because it correlated heavily with sector momentum (correlation ~0.85), not novel language-to-price effects.
  • The author concludes that local models can be especially useful in finance for keeping proprietary transcripts on-device, but any discovered patterns must be sanity-checked against known factors to avoid confidently trading on artifacts.
  • Next, the researcher plans to focus the model on the Q&A portion of earnings calls, hypothesizing that off-script management responses may contain more genuine predictive signal.

Everyone posts benchmarks and arena scores. I wanted to see if a local model could do something that makes actual money. So I took my Gemma 4 26B (IQ4_XS quant, running on a single 4090) and gave it a job: read 2,400 earnings call transcripts from the last 3 years and find language patterns that predict how the stock moves in the 5 days after.

Fine-tuned on about 800 labeled transcripts. The labels were simple: did the stock beat or miss its sector over the next week. Model's job wasn't price prediction. It was tagging sentences with forward-looking confidence scores and flagging specific language shifts, like when management switches between precise numbers and vague qualitative stuff.
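The labeling step is simple enough to sketch. Here's roughly what it looks like, as a minimal illustration with made-up column names (ticker, close, sector proxy prices), assuming forward prices have already been joined onto each call:

```python
# Hypothetical sketch of the labeling described above: for each earnings
# call, the label is whether the stock beat or missed its sector over the
# following week. Column names and the sector proxy are assumptions.
import pandas as pd

def label_transcripts(calls: pd.DataFrame) -> pd.DataFrame:
    """Attach a beat/miss-vs-sector label to each call.

    Expects one row per call with prices already joined: the close at the
    call date and ~5 trading days later, plus the same two prices for a
    sector proxy (e.g. a sector ETF).
    """
    stock_ret = calls["close_fwd"] / calls["close"] - 1.0
    sector_ret = calls["sector_close_fwd"] / calls["sector_close"] - 1.0
    out = calls.copy()
    out["excess_ret"] = stock_ret - sector_ret
    out["label"] = (out["excess_ret"] > 0).astype(int)  # 1 = beat sector
    return out

# Toy data: stock AAA beats its sector, BBB misses.
calls = pd.DataFrame({
    "ticker": ["AAA", "BBB"],
    "close": [100.0, 50.0],
    "close_fwd": [104.0, 49.0],
    "sector_close": [200.0, 200.0],
    "sector_close_fwd": [202.0, 204.0],
})
labeled = label_transcripts(calls)
print(labeled[["ticker", "excess_ret", "label"]])
```

The only design choice that matters here is labeling against the sector rather than raw returns; otherwise the model would mostly learn market beta.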

Inference on all 2,400 took about 14 hours. Not fast but I only need to run this once a quarter so whatever.

Found two things.

Signal A: the real one. When CFOs shift from giving specific guidance numbers to vaguer language in the outlook section ("we feel good about our trajectory" instead of "we expect revenue between X and Y"), stock underperforms its sector by about 1.8% over 5 days. Tested on 600 out-of-sample transcripts. IC of 0.04. Tiny. But statistically significant and basically zero correlation with momentum, value, or any standard factor. That's the part that matters — it's not repackaging something that already exists.
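For anyone unfamiliar with the metric: an IC like the 0.04 above is just the rank correlation between the signal score and the subsequent excess return. A minimal sketch on synthetic data (the numbers here are made up, not my results):

```python
# Information coefficient = Spearman rank correlation between the model's
# score and the forward excess return. Data below is synthetic with a weak
# negative link built in, purely to illustrate the computation.
import numpy as np

def spearman_ic(signal: np.ndarray, returns: np.ndarray) -> float:
    """Spearman rank IC (assumes no ties, fine for continuous scores)."""
    rank_s = signal.argsort().argsort()
    rank_r = returns.argsort().argsort()
    return float(np.corrcoef(rank_s, rank_r)[0, 1])

rng = np.random.default_rng(0)
n = 600  # size of the out-of-sample set mentioned above

vagueness = rng.normal(size=n)                    # synthetic CFO-vagueness score
excess_ret = -0.004 * vagueness + rng.normal(scale=0.03, size=n)

print(f"IC = {spearman_ic(vagueness, excess_ret):.3f}")
```

An IC of 0.04 sounds like nothing, but for a single language feature with no factor overlap it's the kind of edge that stacks.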

Signal B: the ghost. Model also found what looked like a much stronger pattern. "Management confidence" in the prepared remarks section correlated with outperformance at IC 0.09. Got really excited for about two days. Then I regressed it against sector returns and the correlation was 0.85. Tech CEOs sound confident when tech is ripping. The model wasn't reading language patterns. It was picking up sector momentum through the backdoor of CEO tone.

Killed Signal B immediately. If I hadn't checked it against known factors I'd probably be trading it right now thinking I found some edge.
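The check itself is trivial, which is why there's no excuse for skipping it. Here's a sketch where the "confidence" score is deliberately constructed from sector momentum, so the check should flag it (all values synthetic):

```python
# Hedged sketch of the sanity check that killed Signal B: before trusting a
# new signal, correlate it against known factors. The "confidence" series
# here is built mostly from sector momentum on purpose, to simulate a ghost.
import numpy as np

rng = np.random.default_rng(1)
n = 600

sector_momentum = rng.normal(size=n)
# A "confidence" score that mostly restates sector momentum plus noise.
confidence = 0.85 * sector_momentum + 0.5 * rng.normal(size=n)

corr = float(np.corrcoef(confidence, sector_momentum)[0, 1])
print(f"correlation with sector momentum: {corr:.2f}")
if abs(corr) > 0.5:
    print("ghost: mostly a repackaged known factor, don't trade it")
```

In practice you'd regress against a whole factor set (momentum, value, size, sector dummies), not one series, but the idea is the same.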

Takeaway — local models are actually great for this. Running everything locally meant I could throw proprietary transcripts at it without worrying about sending them through someone else's API. That matters a lot in finance. But you absolutely have to sanity check what the model finds against existing factors. It will find ghosts that look extremely convincing.

Next up I'm trying to focus the model specifically on the Q&A section of earnings calls, where management is off script and the language is less rehearsed. I think that's where the real signal lives but haven't proven it yet.
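Isolating the Q&A portion is the easy part, since most transcript vendors mark it with a header. A hypothetical splitter (the "Question-and-Answer Session" marker is a common convention, not a guarantee — check your source):

```python
import re

# Hypothetical splitter for isolating the Q&A portion of a call transcript.
# The header pattern is an assumption about the vendor's formatting.
def split_transcript(transcript: str) -> tuple[str, str]:
    m = re.search(r"question[-\s]and[-\s]answer", transcript, re.IGNORECASE)
    if not m:
        return transcript, ""  # no marker: treat it all as prepared remarks
    return transcript[:m.start()], transcript[m.start():]

sample = (
    "Prepared Remarks\nCEO: We feel good about our trajectory.\n"
    "Question-and-Answer Session\nAnalyst: Can you quantify that?\n"
)
prepared, qa = split_transcript(sample)
print(qa.splitlines()[0])  # -> Question-and-Answer Session
```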

Anyone else using local models for financial text analysis? Curious what setups people are running and whether you've hit similar ghost signal problems.

submitted by /u/CriticalCup6207