Automated Malware Family Classification using Weighted Hierarchical Ensembles of Large Language Models
arXiv cs.AI / 4/6/2026
Key Points
- The paper tackles malware family classification in open-world conditions, where obfuscation and packing limit the scalability of traditional supervised ML approaches that depend on labeled data and handcrafted features.
- It proposes a zero-label framework built on a weighted hierarchical ensemble of pretrained LLMs, combining the models' decision-level outputs rather than training a classifier or learning features.
- The ensemble weights each LLM’s contribution using empirically derived macro-F1 scores and applies a hierarchical strategy that first determines coarse malicious behavior and then refines to fine-grained malware families.
- The authors argue the hierarchical aggregation improves robustness and reduces instability from any single model while better matching analyst-style reasoning.
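
The two-stage weighted aggregation described in the key points can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the summary does not give the exact aggregation rule, so the sketch assumes a plain weighted vote in which each model's label counts with its macro-F1 score. The model names, weights, taxonomy, and family labels are all hypothetical placeholders.

```python
from collections import defaultdict


def weighted_vote(predictions, weights, allowed=None):
    """Return the label with the highest summed weight.

    predictions: dict mapping model name -> predicted label
    weights:     dict mapping model name -> weight (assumed here to be macro-F1)
    allowed:     optional set of labels; votes outside it are ignored
    """
    scores = defaultdict(float)
    for model, label in predictions.items():
        if allowed is not None and label not in allowed:
            continue
        scores[label] += weights[model]
    if not scores:
        return None
    # Sort first so ties are broken deterministically (alphabetically).
    return max(sorted(scores), key=scores.get)


def classify(coarse_preds, family_preds, coarse_w, family_w, taxonomy):
    """Stage 1: vote on the coarse behavior. Stage 2: vote on the family,
    counting only families that belong to the chosen coarse category."""
    coarse = weighted_vote(coarse_preds, coarse_w)
    family = weighted_vote(family_preds, family_w,
                           allowed=taxonomy.get(coarse, set()))
    return coarse, family


# Hypothetical inputs, for illustration only.
taxonomy = {"ransomware": {"LockBit", "WannaCry"}, "trojan": {"Emotet"}}
coarse_w = {"model_a": 0.80, "model_b": 0.60, "model_c": 0.50}
family_w = {"model_a": 0.70, "model_b": 0.90, "model_c": 0.40}
coarse_preds = {"model_a": "ransomware", "model_b": "trojan", "model_c": "ransomware"}
family_preds = {"model_a": "LockBit", "model_b": "Emotet", "model_c": "WannaCry"}

print(classify(coarse_preds, family_preds, coarse_w, family_w, taxonomy))
```

In this toy input, model_b carries the largest family-level weight, but its vote is discarded because its family falls outside the coarse category chosen in the first stage. That is one way a hierarchical scheme could dampen instability from a single model, under the assumptions stated above.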