
Using a Human-AI Teaming Approach to Create and Curate Scientific Datasets with the SCILIRE System

arXiv cs.CL / 3/16/2026

📰 News | Tools & Practical Usage | Models & Research

Key Points

  • The paper presents SCILIRE, a system for creating datasets from scientific literature using Human-AI teaming to verify and curate data.
  • It enables an iterative workflow where researchers review and correct AI outputs, using corrections as feedback to improve future LLM-based inference.
  • The evaluation combines intrinsic benchmarking and real-world case studies across multiple domains to demonstrate higher extraction fidelity and more efficient dataset creation.
  • The work highlights how human-in-the-loop feedback can continuously improve AI-assisted data extraction for scientific knowledge bases.

Abstract

The rapid growth of scientific literature has made manual extraction of structured knowledge increasingly impractical. To address this challenge, we introduce SCILIRE, a system for creating datasets from scientific literature. SCILIRE has been designed around Human-AI teaming principles centred on workflows for verifying and curating data. It facilitates an iterative workflow in which researchers can review and correct AI outputs. Furthermore, this interaction is used as a feedback signal to improve future LLM-based inference. We evaluate our design using a combination of intrinsic benchmarking outcomes together with real-world case studies across multiple domains. The results demonstrate that SCILIRE improves extraction fidelity and facilitates efficient dataset creation.
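
The abstract does not say how reviewer corrections are fed back into LLM inference. One plausible mechanism, shown here purely as an illustrative sketch and not as SCILIRE's actual implementation, is to store each corrected record and reuse the most recent ones as few-shot examples in later extraction prompts. All names below (`Correction`, `FeedbackStore`, `build_prompt`) are hypothetical, and the prompt wording is an assumption.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Correction:
    """One reviewed record: source passage, AI output, and the curator's fix."""
    passage: str
    ai_output: Dict[str, str]
    corrected: Dict[str, str]


@dataclass
class FeedbackStore:
    """Hypothetical store of curator corrections for reuse in later prompts."""
    corrections: List[Correction] = field(default_factory=list)

    def add(self, passage: str, ai_output: Dict[str, str],
            corrected: Dict[str, str]) -> None:
        # Keep only records the curator actually changed.
        if ai_output != corrected:
            self.corrections.append(Correction(passage, ai_output, corrected))

    def recent(self, k: int) -> List[Correction]:
        # Guard k <= 0, since a slice of [-0:] would return the whole list.
        return self.corrections[-k:] if k > 0 else []


def build_prompt(passage: str, fields: List[str],
                 store: FeedbackStore, k: int = 3) -> str:
    """Build an extraction prompt with up to k corrected examples prepended."""
    lines = [f"Extract the fields {', '.join(fields)} from the passage."]
    for c in store.recent(k):
        lines.append(f"Passage: {c.passage}")
        lines.append(f"Answer: {c.corrected}")
    lines.append(f"Passage: {passage}")
    lines.append("Answer:")
    return "\n".join(lines)


if __name__ == "__main__":
    store = FeedbackStore()
    # The curator added the missing unit, so this record is stored.
    store.add("The sample melted at 300 K.",
              {"temperature": "300"}, {"temperature": "300 K"})
    print(build_prompt("Annealing was done at 450 K.", ["temperature"], store))
```

The resulting prompt would be passed to whichever LLM performs the extraction. Under this assumption each correction affects every later prompt immediately, with no retraining step.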