LLM-based Atomic Propositions help weak extractors: Evaluation of a Propositioner for triplet extraction

arXiv cs.CL / 4/6/2026

Key Points

The paper studies whether decomposing complex sentences into atomic propositions—minimal, semantically autonomous information units—can improve knowledge-graph triplet extraction from natural language.
It introduces MPropositionneur-V2, a small multilingual model (6 European languages) built via knowledge distillation from Qwen3-32B into a Qwen3-0.6B architecture.
Experiments across SMiLER, FewRel, DocRED, and CaRB show that atomic propositions particularly help weaker triplet extractors by increasing relation recall and improving overall accuracy in multilingual settings.
When stronger LLM-based extractors are used, the authors propose a fallback combination strategy that recovers entity recall losses while retaining atomic-proposition gains in relation extraction.
Overall, the work positions atomic propositions as an interpretable intermediate representation that complements (rather than replaces) existing extraction systems.

Abstract

Knowledge Graph construction from natural language requires extracting structured triplets from complex, information-dense sentences. In this paper, we investigate whether decomposing text into atomic propositions (minimal, semantically autonomous units of information) can improve triplet extraction. We introduce MPropositionneur-V2, a small multilingual model covering six European languages, trained by knowledge distillation from Qwen3-32B into a Qwen3-0.6B architecture, and we evaluate its integration into two extraction paradigms: entity-centric (GLiREL) and generative (Qwen3). Experiments on SMiLER, FewRel, DocRED and CaRB show that atomic propositions benefit weaker extractors (GLiREL, CoreNLP, 0.6B models), improving relation recall and, in the multilingual setting, overall accuracy. For stronger LLMs, a fallback combination strategy recovers entity recall losses while preserving the gains in relation extraction. These results show that atomic propositions are an interpretable intermediate data structure that complements extractors without replacing them.
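To make the pipeline described above more concrete, here is a minimal Python sketch of one way a "decompose, extract, then fall back" combination could look. It is an illustration under stated assumptions, not the authors' implementation: the summary does not specify how the fallback combination works, so the rule used here (keep every proposition-derived triplet, then add sentence-level triplets that mention an entity the propositions missed) is a guess at the general idea. The names `extract_with_fallback`, `toy_propositioner`, and `toy_extractor` are hypothetical, and the toy functions are hard-coded stand-ins for a real propositioner model and a real triplet extractor.

```python
from typing import Callable, Dict, Iterable, List, Set, Tuple

# (head entity, relation, tail entity)
Triplet = Tuple[str, str, str]


def extract_with_fallback(
    sentence: str,
    propositioner: Callable[[str], List[str]],
    extractor: Callable[[str], Iterable[Triplet]],
) -> List[Triplet]:
    """Combine proposition-level and sentence-level extraction.

    ASSUMPTION: this combination rule is illustrative only; the source
    does not describe the exact fallback strategy.
    """
    # Step 1: extract triplets from each atomic proposition.
    result: List[Triplet] = []
    for proposition in propositioner(sentence):
        for triplet in extractor(proposition):
            if triplet not in result:
                result.append(triplet)

    # Step 2: collect the entities already covered by those triplets.
    covered: Set[str] = set()
    for head, _, tail in result:
        covered.update((head, tail))

    # Step 3: fall back to the original sentence, keeping only triplets
    # that bring in an entity the propositions did not cover.
    for triplet in extractor(sentence):
        head, _, tail = triplet
        if triplet not in result and (head not in covered or tail not in covered):
            result.append(triplet)
    return result


# Hypothetical toy data standing in for real model outputs.
SENTENCE = "Marie Curie, born in Warsaw, won the Nobel Prize with Pierre Curie."
TOY_PROPOSITIONS: Dict[str, List[str]] = {
    SENTENCE: [
        "Marie Curie was born in Warsaw.",
        "Marie Curie won the Nobel Prize.",
    ],
}
TOY_TRIPLETS: Dict[str, List[Triplet]] = {
    "Marie Curie was born in Warsaw.": [("Marie Curie", "born_in", "Warsaw")],
    "Marie Curie won the Nobel Prize.": [("Marie Curie", "won", "Nobel Prize")],
    SENTENCE: [
        ("Marie Curie", "won", "Nobel Prize"),
        ("Pierre Curie", "won", "Nobel Prize"),
    ],
}


def toy_propositioner(sentence: str) -> List[str]:
    # Stand-in for a propositioner model; returns hard-coded propositions.
    return TOY_PROPOSITIONS.get(sentence, [sentence])


def toy_extractor(text: str) -> List[Triplet]:
    # Stand-in for a triplet extractor; returns hard-coded triplets.
    return TOY_TRIPLETS.get(text, [])
```

In this toy setup the propositions surface the `born_in` relation that the sentence-level extractor misses, while the fallback pass restores the triplet involving "Pierre Curie", an entity the decomposition dropped. Calling `extract_with_fallback(SENTENCE, toy_propositioner, toy_extractor)` returns all three triplets.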
