Local LLM evaluation advice after DPO on a psychotherapy dataset

Reddit r/LocalLLaMA / 3/28/2026

💬 Opinion | Developer Stack & Infrastructure | Tools & Practical Usage | Models & Research

Key Points

  • A Reddit user describes fine-tuning Gemma 3 4B with DPO on a psychotherapy dataset to build a local companion-style chatbot, explicitly stating it is not medical advice or therapy.

I fine-tuned Gemma 3 4B on a psychotherapy dataset using DPO as part of an experiment to make a local chatbot that can act as a companion (yes, this is absolutely not intended to give medical advice or be a therapist).

I must thank whoever invented QLoRA and PEFT - I was able to run the fine-tuning on my RTX 3050 Ti laptop. It was slow, and the laptop ran hot - but it worked in the end :D

What benchmarks can I run locally on my RTX 3050 Ti (4GB) to evaluate the improvement (or lack thereof) of my fine-tuned model vis-a-vis the "stock" Gemma 3 model?
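One check that might fit this setup is held-out preference accuracy: on preference pairs kept out of training, count how often the fine-tuned model shifts probability toward the chosen response relative to the stock model. The sketch below is only the scoring step, assuming the summed token log-probabilities of each response have already been computed under both models (that part is not shown). The function names, `beta=0.1`, and the example numbers are all made up for illustration, not taken from any actual run.

```python
def dpo_reward_margin(ft_chosen, ft_rejected, stock_chosen, stock_rejected, beta=0.1):
    """Implicit DPO reward margin for one preference pair.

    Each argument is the summed log-probability of a response under the
    fine-tuned (ft) or stock model. A positive margin means fine-tuning moved
    probability toward the chosen response, relative to the stock model.
    """
    chosen_shift = ft_chosen - stock_chosen
    rejected_shift = ft_rejected - stock_rejected
    return beta * (chosen_shift - rejected_shift)


def preference_accuracy(pairs, beta=0.1):
    """Return (fraction of pairs with positive margin, mean margin)."""
    if not pairs:
        raise ValueError("need at least one preference pair")
    margins = [dpo_reward_margin(*pair, beta=beta) for pair in pairs]
    wins = sum(1 for m in margins if m > 0)
    return wins / len(margins), sum(margins) / len(margins)


# Hypothetical log-probs: (ft_chosen, ft_rejected, stock_chosen, stock_rejected)
example_pairs = [
    (-12.0, -15.0, -13.0, -14.0),
    (-20.0, -18.0, -19.0, -19.0),
    (-8.0, -9.5, -8.5, -9.0),
]

accuracy, mean_margin = preference_accuracy(example_pairs)
print(f"preference accuracy: {accuracy:.2f}, mean margin: {mean_margin:.3f}")
```

An accuracy near 0.5 on held-out pairs would suggest the fine-tune learned little that generalizes. Since this only measures agreement with the dataset's own preferences, it says nothing about general capability, which would need a separate benchmark.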

submitted by /u/i5_8300h