Highly Efficient and Effective LLMs with Multi-Boolean Architectures

arXiv stat.ML / 4/22/2026


Key Points

  • The paper addresses the challenge that naive post-training weight binarization for LLMs often causes large performance drops, while training-aware binarization can be complex and inefficient.
  • It proposes a framework that represents LLMs using multi-kernel Boolean parameters, enabling direct fine-tuning in the Boolean domain without relying on full-precision latent weights.
  • By removing the need for latent weights, the approach aims to reduce model complexity both during fine-tuning and at inference time.
  • Experiments across multiple LLMs indicate the method outperforms recent ultra-low-bit quantization and binarization techniques.
  • Overall, the work suggests a new direction for efficiently adapting and deploying LLMs using Boolean-domain training with improved effectiveness.
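To make the "multi-kernel Boolean parameters" idea concrete, the sketch below shows a generic multi-Boolean decomposition of a weight matrix via greedy residual binarization, W ≈ Σ_k α_k B_k with each B_k ∈ {−1, +1}. This is an illustrative assumption about what such a representation can look like, not the paper's actual algorithm (which, notably, fine-tunes directly in the Boolean domain rather than approximating full-precision weights):

```python
import numpy as np

def multi_boolean_approx(W, num_kernels=3):
    """Approximate W as a sum of scaled Boolean (sign) matrices:
    W ~= sum_k alpha_k * B_k, with B_k entries in {-1, +1}.
    Generic greedy residual binarization, for illustration only."""
    residual = W.copy()
    alphas, kernels = [], []
    for _ in range(num_kernels):
        B = np.sign(residual)
        B[B == 0] = 1.0                   # keep entries strictly Boolean
        alpha = np.abs(residual).mean()   # least-squares scale for sign(residual)
        alphas.append(alpha)
        kernels.append(B)
        residual = residual - alpha * B   # binarize what is left over
    return alphas, kernels

rng = np.random.default_rng(0)
W = rng.normal(size=(64, 64))
alphas, kernels = multi_boolean_approx(W, num_kernels=3)
W_hat = sum(a * B for a, B in zip(alphas, kernels))

# Each extra Boolean kernel tightens the approximation versus plain
# one-shot binarization:
err_one_kernel = np.linalg.norm(W - np.abs(W).mean() * np.sign(W))
err_multi = np.linalg.norm(W - W_hat)
```

At inference, each α_k B_k term needs only sign bits plus one scalar, so a few kernels cost far less memory than full-precision weights while recovering much of the lost fidelity; the paper's contribution is training such Boolean parameters directly, without full-precision latent copies.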

Abstract

Weight binarization has emerged as a promising strategy to reduce the complexity of large language models (LLMs). Existing approaches fall into two categories: post-training binarization, which is simple but causes severe performance loss, and training-aware methods, which depend on full-precision latent weights, adding complexity and limiting efficiency. We propose a novel framework that represents LLMs with multi-kernel Boolean parameters and, for the first time, enables direct fine-tuning of LLMs in the Boolean domain, eliminating the need for latent weights. This enhances representational capacity and dramatically reduces complexity during both fine-tuning and inference. Extensive experiments across diverse LLMs show that our method outperforms recent ultra-low-bit quantization and binarization techniques.