AgentBench SFT Tuning Experiment Log: A Systematic Study of LoRA Rank, Epochs, and Merge Methods

Zenn / 3/11/2026


Key Points

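Using the AgentBench dataset, the author ran systematic SFT (supervised fine-tuning) experiments across different combinations of LoRA rank, number of training epochs, and merge method.
The article analyzes how each parameter setting affects fine-tuning performance and searches for the best tuning conditions.
The results offer concrete findings that help improve agent task performance and deepen the understanding of efficient fine-tuning with LoRA.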
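As a rough illustration of what a systematic sweep over these parameters can look like, here is a minimal sketch that enumerates run configurations. The rank values are the ones listed in the article's TL;DR; the epoch values, the `build_sweep` helper, and the config key names are assumptions made for illustration, not the author's actual setup.

```python
from itertools import product

# LoRA ranks named in the article's TL;DR.
RANKS = [8, 12, 16, 32, 48]
# Hypothetical epoch values in 0.1 steps around 1.0 (assumed, not from the article).
EPOCHS = [0.9, 1.0, 1.1]


def build_sweep(ranks, epochs):
    """Return one config dict per (rank, epochs) combination."""
    return [
        {
            "run_name": f"r{rank}_ep{ep:.1f}",  # hypothetical naming scheme
            "lora_rank": rank,
            "num_train_epochs": ep,
        }
        for rank, ep in product(ranks, epochs)
    ]


if __name__ == "__main__":
    for cfg in build_sweep(RANKS, EPOCHS):
        print(cfg)
```

Each dictionary could then be handed to whatever training script is in use. Keeping the grid explicit makes it easier to see which combinations were actually covered.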

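TL;DR: More than 100 models were trained with SFT for AgentBench (ALFWorld + DBBench), systematically testing LoRA rank, number of epochs, model merge methods, and more. Main findings:
LoRA rank: r=8/12/16/48 were all harmful; only r=32 was effective.
Epochs: a difference of 0.1 is critical; epochs=1.0 is a pinpoint optimum.
eval_loss: it does not correlate with task performance; the model with the lowest eval_loss had the worst task performance.
Model merging: SLERP/DARE-TIES is not "taking the best of both" but a "redistribution of trade-offs".
Data augmentation: 3,...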
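For readers unfamiliar with SLERP, the merge method named in the TL;DR, the following is a minimal sketch of spherical linear interpolation between two flattened weight vectors. It is a generic illustration using only the standard library, not the author's merge pipeline; the function name `slerp`, the `eps` threshold, and the fallback to plain linear interpolation are assumptions.

```python
import math
from typing import List


def slerp(t: float, a: List[float], b: List[float], eps: float = 1e-8) -> List[float]:
    """Interpolate from a (t=0) to b (t=1) along the arc between them."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # Plain linear interpolation, used when the angle is undefined or degenerate.
    lerp = [(1.0 - t) * x + t * y for x, y in zip(a, b)]
    if norm_a < eps or norm_b < eps:
        return lerp
    # Angle between the two vectors, clamped for numerical safety.
    cos_omega = sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)
    cos_omega = max(-1.0, min(1.0, cos_omega))
    omega = math.acos(cos_omega)
    sin_omega = math.sin(omega)
    if abs(sin_omega) < eps:
        # Vectors are (anti)parallel; the SLERP weights would divide by ~0.
        return lerp
    weight_a = math.sin((1.0 - t) * omega) / sin_omega
    weight_b = math.sin(t * omega) / sin_omega
    return [weight_a * x + weight_b * y for x, y in zip(a, b)]
```

In a real merge this would typically be applied per tensor to the parameters of two fine-tuned checkpoints. Because every merged weight is a blend of both parents, gains inherited from one model come paired with whatever the other model contributes, which is consistent with the "redistribution of trade-offs" framing above.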



