Adapting Methods for Domain-Specific Japanese Small LMs: Scale, Architecture, and Quantization
arXiv cs.LG / 3/20/2026
Key Points
- The paper presents a systematic methodology for building domain-specific Japanese small language models using QLoRA fine-tuning, addressing training scale, base-model selection, and architecture-aware quantization (see the QLoRA sketch after this list).
- Stage 1 finds the optimal training scale at roughly 4,000 samples, where test-set NLL reaches its minimum of 1.127; overfitting sets in at 5,000 samples (see the NLL sketch after this list).
- Stage 2 shows that Llama-3 models with Japanese continual pre-training (Swallow-8B, ELYZA-JP-8B) outperform multilingual models such as Qwen2.5-7B.
- Stage 3 reports architecture-aware quantization results: Llama-3 architectures improve under Q4_K_M quantization while GQA architectures degrade. The recommended production configuration, Swallow-8B at Q4_K_M, scores 2.830/3 at 8.9 s per question with a 4.9 GB footprint, making compact Japanese specialist LMs practical on consumer hardware (an inference sketch follows below).
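The QLoRA recipe behind the first key point maps to a standard setup: a 4-bit-quantized, frozen base model with low-rank adapters trained on top. Below is a minimal sketch assuming the Hugging Face transformers/peft/bitsandbytes stack; the model id, LoRA rank, and target modules are illustrative assumptions rather than the paper's exact configuration.

```python
# Minimal QLoRA setup sketch (not the authors' exact code): quantize the
# frozen base model to 4-bit NF4 and attach trainable LoRA adapters.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

base = "tokyotech-llm/Llama-3-Swallow-8B-Instruct-v0.1"  # assumed HF model id

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                       # frozen base weights stored in 4-bit NF4
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)

tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(
    base, quantization_config=bnb_config, device_map="auto"
)
model = prepare_model_for_kbit_training(model)

lora = LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05,  # illustrative hyperparameters
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora)
model.print_trainable_parameters()           # only the adapter weights are trainable
```

The resulting `model` can then be passed to a standard `Trainer` or `SFTTrainer` loop over the domain-specific instruction data.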
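The Stage 1 scale-selection result reduces to tracking held-out negative log-likelihood as the fine-tuning set grows. Here is a hedged sketch of that metric, assuming a Hugging Face causal LM whose reported loss is mean per-token cross-entropy; `held_out_texts` and the model/tokenizer handles are placeholders, not the paper's evaluation code.

```python
# Sketch: mean per-token NLL over a held-out test set, the quantity that
# bottoms out at ~4,000 training samples in the paper's Stage 1.
import torch

@torch.no_grad()
def test_set_nll(model, tokenizer, texts, device="cuda"):
    total_nll, total_tokens = 0.0, 0
    for text in texts:
        enc = tokenizer(text, return_tensors="pt").to(device)
        out = model(**enc, labels=enc["input_ids"])
        n_tokens = enc["input_ids"].size(1) - 1   # positions actually scored after the shift
        total_nll += out.loss.item() * n_tokens   # out.loss is mean NLL per scored token
        total_tokens += n_tokens
    return total_nll / total_tokens

# Evaluate checkpoints trained on 1k/2k/3k/4k/5k samples and keep the scale
# where held-out NLL stops falling, e.g.:
# nll = test_set_nll(model, tokenizer, held_out_texts)
```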
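On the deployment side (Stage 3), a Q4_K_M GGUF export of the recommended model can be served with llama.cpp. Below is a minimal sketch using the llama-cpp-python bindings; the GGUF filename and the test question are assumptions, and per-question latency depends entirely on the host hardware.

```python
# Sketch: load a Q4_K_M GGUF export and time one question, mirroring the
# seconds-per-question and file-size trade-off summarized above.
import time
from llama_cpp import Llama

llm = Llama(
    model_path="swallow-8b-instruct-q4_k_m.gguf",  # assumed local GGUF export (~4.9 GB)
    n_ctx=2048,
    n_gpu_layers=-1,   # offload all layers if a GPU is present; use 0 for CPU-only
)

prompt = "質問: 日本語で自己紹介を一文でお願いします。\n回答: "  # placeholder test question
start = time.perf_counter()
out = llm(prompt, max_tokens=256, temperature=0.2)
elapsed = time.perf_counter() - start

print(out["choices"][0]["text"])
print(f"latency: {elapsed:.1f} s/question")
```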