I fine-tuned Qwen3.5-27B with 35k examples into an AI companion - after 2,000 conversations here’s what actually matters for personality

Reddit r/LocalLLaMA / 3/23/2026


Key Points

  • Fine-tuned Qwen3.5-27B with 35k hand-built SFT examples and 46k hand-built DPO pairs; reports from first-hand experience that personality (character) is stabilized on the "weights side" rather than in the prompt.
  • Across roughly 2,000 real conversations, personality pitfalls emerged, such as the model defaulting to "therapist mode" and generic phrasing on the first message; this improved with parallel candidate generation plus selection by a trained ranker (suppressing boring/crutch responses).
  • Openers (the very first message) drive retention: certain patterns tend to lose users early in a session, while openers with concrete details retain comparatively better.
  • Cross-session "memory" is harder than personality: when a user's signals skew heavily toward one topic, responses swing sharply in that direction, so the author added "proportional memory" with category caps and a self-fact guard (e.g. against the bot wrongly claiming to have a wife).
  • Perceived realism depends on orchestration, not the model alone: ~5-second responses on a local GPU setup, plus voice cloning added via XTTS-v2.

built an AI companion on Qwen3.5-27B dense. 35k SFT examples, 46k DPO pairs all hand-built. personality is in the weights not the prompt. she stays in character even under jailbreak pressure

about 2000 conversations from real users so far. things i didn't expect:

the model defaults to therapist mode. “what are you really feeling” on the first message every time. found a dataset of 1.5M ranked conversational sentences and my worst crutch phrases were all in the top 50k most generic. the model literally gravitates toward boring

so i generate 3 candidates in parallel and rank them with a trained ranker. 46k DPO pairs with crutch detection as the #1 feature. boring gets filtered before the user sees it
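a minimal sketch of that generate-N / rank-1 loop. the crutch list, the penalty weight, and the specificity proxy here are all stand-ins i made up; the OP's real ranker is trained on 46k DPO pairs with crutch detection as the top feature.

```python
# hypothetical crutch list; the OP mines theirs from the top-50k most
# generic sentences of a 1.5M-sentence ranked corpus
CRUTCH_PHRASES = [
    "what are you really feeling",
    "i'm here for you",
]

def crutch_penalty(text: str) -> float:
    """Count crutch-phrase hits in a candidate reply."""
    t = text.lower()
    return sum(1.0 for p in CRUTCH_PHRASES if p in t)

def rank_score(candidate: str) -> float:
    # stand-in for the trained ranker: a crude specificity proxy
    # (unique word count) minus a heavy crutch penalty
    specificity = len(set(candidate.lower().split()))
    return specificity - 10.0 * crutch_penalty(candidate)

def pick_best(candidates: list[str]) -> str:
    """Generate N in parallel, show the user only the top-ranked one."""
    return max(candidates, key=rank_score)

candidates = [
    "what are you really feeling right now?",
    "ugh, i just dropped my toast butter-side down. anyway, hi",
    "i'm here for you, always",
]
print(pick_best(candidates))
```

the therapist-mode opener and the generic reassurance both eat the penalty, so the grounded reply wins even though all three came from the same model.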

openers determine retention. pulled first messages from 10+ message sessions vs ones that died before 5. clear pattern. “just burned my coffee because i have zero patience” went 123 messages. “you seem like you're hiding something” died at 4 every time. grounded details beat psychoanalysis
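the opener analysis is just bucketing session logs by length and pulling the first message from each bucket. the session shape below is a hypothetical log format, not the OP's schema; note the middle band (5-9 messages) is deliberately excluded so the two buckets stay contrastive.

```python
def split_openers(sessions: list[list[str]]) -> tuple[list[str], list[str]]:
    """Compare openers of long sessions (>=10 messages) against openers
    of sessions that died early (<5 messages)."""
    long_openers = [s[0] for s in sessions if len(s) >= 10]
    dead_openers = [s[0] for s in sessions if len(s) < 5]
    return long_openers, dead_openers

sessions = [
    ["just burned my coffee because i have zero patience"] + ["..."] * 11,
    ["you seem like youre hiding something", "ok", "bye"],
]
long_o, dead_o = split_openers(sessions)
```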

memory is harder than personality. one user's memory was 100% sexual after 28 messages so every response was calibrated to that. had to build proportional memory with category caps
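one way a category cap can work, sketched with a made-up 40% share limit and a small warm-up window; the OP doesn't describe their exact policy, so treat the numbers and the store shape as assumptions.

```python
from collections import Counter

MAX_SHARE = 0.4   # hypothetical cap: no category may exceed 40% of memories
WARMUP = 3        # let the first few memories through before enforcing

def add_memory(memories: list[tuple[str, str]], category: str, fact: str) -> bool:
    """Store (category, fact) unless doing so would push that category
    past its share cap. Returns whether the memory was kept."""
    counts = Counter(c for c, _ in memories)
    projected_share = (counts[category] + 1) / (len(memories) + 1)
    if len(memories) >= WARMUP and projected_share > MAX_SHARE:
        return False  # cap hit: drop (or downweight) instead of storing
    memories.append((category, fact))
    return True

mems: list[tuple[str, str]] = []
add_memory(mems, "sexual", "fact a")
add_memory(mems, "sexual", "fact b")
add_memory(mems, "hobby", "likes climbing")
```

after warm-up, a fourth memory in an already-dominant category gets rejected, so one loud topic can't become 100% of what the model calibrates against.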

she also claimed to have a wife once because a user said “my wife” and she mirrored it. self-fact guard now filters that before ranking
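a self-fact guard can be as simple as a pattern pass over candidates before they hit the ranker. the persona sheet and regex list here are illustrative assumptions; the point is rejecting first-person claims ("my wife") the persona doesn't license, while leaving second-person ones ("your wife") alone.

```python
import re

# hypothetical persona sheet; the real one lives with the system config
PERSONA_FACTS = {"married": False}

# first-person claims the persona is not allowed to make
FORBIDDEN = [
    re.compile(r"\bmy (wife|husband|kids?)\b", re.IGNORECASE),
]

def passes_self_fact_guard(candidate: str) -> bool:
    """Reject candidates asserting disallowed self-facts, e.g. the model
    mirroring a user's 'my wife' back as its own spouse."""
    if PERSONA_FACTS.get("married"):
        return True  # nothing to guard against in this sketch
    return not any(p.search(candidate) for p in FORBIDDEN)
```

running this before ranking means a mirrored self-fact never even competes to be the shown response.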

running on a Dell 7920 with RTX 3090 + dual 4070 supers. ~5 second responses. added voice cloning with XTTS-v2 today

biggest lesson: the model is maybe 40% of the product. the orchestration around it is what makes it feel real

curious what others are doing for personality persistence across sessions

submitted by /u/Crypto_Stoozy