I recently worked on a VLM training project that took a standard 135M-parameter text-only language model and gave it vision capabilities. I wrote an article on Towards Data Science covering each stage of the project and what I learned along the way.
The article contains all my notes on how Q-Formers work, how the adapters that connect the vision model to the LM are trained, the datasets involved, etc. The Git repo is also open-sourced.
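For a rough idea of what a Q-Former-style adapter looks like, here is a minimal sketch (my own illustration, assuming a BLIP-2-style design, not code from the article or repo): learned query tokens cross-attend to frozen vision features, and the result is projected into the LM's embedding dimension. All dimensions below are assumptions for illustration.

```python
import torch
import torch.nn as nn

class QFormerAdapter(nn.Module):
    """Sketch of a Q-Former-style adapter: learned queries pull information
    out of frozen vision features via cross-attention, then a linear layer
    maps the result into the LM's embedding space."""

    def __init__(self, vision_dim=768, lm_dim=576, num_queries=32, num_heads=8):
        super().__init__()
        # Learned query tokens (trained; the vision encoder stays frozen)
        self.queries = nn.Parameter(torch.randn(num_queries, vision_dim) * 0.02)
        self.cross_attn = nn.MultiheadAttention(vision_dim, num_heads, batch_first=True)
        self.proj = nn.Linear(vision_dim, lm_dim)  # into LM embedding space

    def forward(self, vision_feats):  # vision_feats: (batch, num_patches, vision_dim)
        b = vision_feats.size(0)
        q = self.queries.unsqueeze(0).expand(b, -1, -1)
        out, _ = self.cross_attn(q, vision_feats, vision_feats)
        return self.proj(out)  # (batch, num_queries, lm_dim)

adapter = QFormerAdapter()
feats = torch.randn(2, 196, 768)  # e.g. ViT patch features for 2 images
tokens = adapter(feats)
print(tokens.shape)  # torch.Size([2, 32, 576])
```

The resulting `num_queries` soft tokens are prepended to the text embeddings fed into the LM; only the adapter (and optionally the LM) is updated during training.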
Sharing in case anyone is working on a similar project and finds it useful as a learning resource.
https://towardsdatascience.com/how-vision-language-models-are-trained-from-scratch/