Been working on this with my team at Weco. If you're running autoresearch loops (automated ML research pipelines), you've probably hit the same wall we did — there's no clean way to monitor what's happening across steps, compare runs, or share results with collaborators. We built tooling that gives you W&B-style tracking but designed specifically for autoresearch workflows: step-by-step monitoring, performance analysis, and shareable run dashboards. Demo link here: https://x.com/zhengyaojiang/status/2034687066659791345 Would love feedback from anyone doing autoresearch... What metrics do you actually care about tracking?
[P] We built a Weights & Biases for Autoresearch - track steps, compare experiments, and share results
Reddit r/MachineLearning / 3/20/2026
📰 News · Tools & Practical Usage · Models & Research
Key Points
- A team at Weco built a new Weights & Biases–style tracking tool tailored for autoresearch workflows to monitor steps, compare runs, and share results with collaborators.
- The tool aims to fill gaps in autoresearch monitoring by offering step-by-step tracking, performance analysis, and shareable dashboards.
- A demo link is provided and feedback is requested on which metrics matter most for autoresearch pipelines.
- The project emphasizes collaborative experimentation in automated ML workflows.
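
The post doesn't show the tool's actual API, but the step-by-step tracking it describes can be sketched with a minimal local stand-in. Everything below (`RunTracker`, the JSONL file layout, the metric names) is hypothetical, not the Weco interface:

```python
import json
import time
from pathlib import Path

class RunTracker:
    """Hypothetical minimal stand-in for a W&B-style autoresearch tracker:
    appends one JSON record per step to a file a dashboard could tail."""

    def __init__(self, run_name: str, out_dir: str = "runs"):
        self.path = Path(out_dir) / f"{run_name}.jsonl"
        self.path.parent.mkdir(parents=True, exist_ok=True)
        self.step = 0

    def log(self, **metrics):
        # Record the step index, a timestamp, and arbitrary metrics.
        record = {"step": self.step, "time": time.time(), **metrics}
        with self.path.open("a") as f:
            f.write(json.dumps(record) + "\n")
        self.step += 1
        return record

# Usage: one log call per autoresearch step
tracker = RunTracker("demo-run")
tracker.log(val_score=0.71, hypothesis="baseline")
tracker.log(val_score=0.74, hypothesis="add feature X")
```

An append-only JSONL log keeps each step immutable, which makes comparing and sharing runs a matter of diffing or merging files.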
Related Articles

Easing veterans' burden of training junior staff: generating PLC-control "ladder diagrams" with AI
日経XTECH

Your AI generated code is "almost right", and that is actually WORSE than it being "wrong".
Dev.to

Lessons from Academic Plagiarism Tools for SaaS Product Development
Dev.to

Windsurf’s New Pricing Explained: Simpler AI Coding or Hidden Trade-Offs?
Dev.to

Building Production RAG Systems with PostgreSQL: Complete Implementation Guide
Dev.to