Task Expansion and Cross Refinement for Open-World Conditional Modeling

arXiv cs.LG / 3/17/2026

Key Points

  • The paper introduces Task Expansion and Cross Refinement (TEXR), which broadens task coverage for open-world conditional modeling by generating diverse, uninstantiated dataset schemas and weakly instantiating them with structured probabilistic generators guided by large language models.
  • It performs cross-model refinement by training on disjoint data partitions and revising synthetic values across splits to reduce confirmation bias and improve pseudo-value quality.
  • The refined synthetic datasets are aggregated with real data to train a unified conditional model, boosting zero-, few-, and many-shot performance across heterogeneous tabular benchmarks.
  • Across multiple backbones, TEXR demonstrates consistent improvements, highlighting the value of structured task expansion and cross refinement for open-world conditional modeling.
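
The cross-refinement idea in the second point can be pictured with a small sketch. It is an illustration under stated assumptions, not the authors' implementation: a least-squares linear model stands in for the conditional model, there are exactly two partitions, and the names `fit_linear`, `predict_linear`, `cross_refine`, and the blending weight `alpha` are all hypothetical. The one property it tries to capture is that each pseudo-value is revised only by a model that never trained on it.

```python
import numpy as np

def fit_linear(X, y):
    """Least-squares fit with a bias term (a stand-in for any conditional model)."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    w, *_ = np.linalg.lstsq(Xb, y, rcond=None)
    return w

def predict_linear(w, X):
    """Apply the fitted weights, including the bias column."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    return Xb @ w

def cross_refine(X, y_pseudo, alpha=0.5, seed=0):
    """Split rows into two disjoint halves, fit one model per half, and let
    each model revise the pseudo-values of the half it did NOT train on.

    alpha is an assumed blending weight: 0 keeps the original pseudo-values,
    1 replaces them with the other model's predictions.
    """
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    half_a, half_b = idx[: len(X) // 2], idx[len(X) // 2:]

    w_a = fit_linear(X[half_a], y_pseudo[half_a])
    w_b = fit_linear(X[half_b], y_pseudo[half_b])

    refined = y_pseudo.copy()
    # Model B revises split A, and model A revises split B.
    refined[half_a] = (1 - alpha) * y_pseudo[half_a] + alpha * predict_linear(w_b, X[half_a])
    refined[half_b] = (1 - alpha) * y_pseudo[half_b] + alpha * predict_linear(w_a, X[half_b])
    return refined

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.normal(size=(400, 3))
    y_true = X @ np.array([1.5, -2.0, 0.5]) + 0.3
    y_noisy = y_true + rng.normal(scale=1.0, size=400)  # noisy pseudo-values
    y_refined = cross_refine(X, y_noisy)
    print("MSE before:", np.mean((y_noisy - y_true) ** 2))
    print("MSE after: ", np.mean((y_refined - y_true) ** 2))
```

Because a model never scores the rows it was fit on, it cannot simply reproduce its own training noise, which is one plausible reading of how splitting reduces confirmation bias.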

Abstract

Open-world conditional modeling (OCM) requires a single model to answer arbitrary conditional queries across heterogeneous datasets, where observed variables and targets vary and arise from a vast open-ended task universe. Because any finite collection of real-world datasets covers only a small fraction of this space, we propose Task Expansion and Cross Refinement (TEXR), a semi-supervised framework that enlarges effective task coverage through structured synthesis and refinement of semantic data contexts. TEXR first generates diverse uninstantiated dataset schemas and weakly instantiates them via structured probabilistic generators guided by large language models. It then performs cross-model refinement by training on disjoint data partitions and revising synthetic values across splits to reduce confirmation bias and improve pseudo-value quality. The refined synthetic datasets are aggregated with real data to train a unified conditional model. Across heterogeneous tabular benchmarks, TEXR consistently improves zero-, few-, and many-shot performance for multiple OCM backbones, demonstrating that structured task expansion and cross refinement enhance open-world conditional modeling.
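
To make the distinction between an uninstantiated schema and its weak instantiation concrete, here is a minimal sketch. Everything in it is an assumption made for illustration: the `Column` type, the insurance-style `SCHEMA`, and the hand-written `sample_row` generator are hypothetical, and the generator is fixed by hand here, whereas the abstract describes generators guided by large language models.

```python
import random
from dataclasses import dataclass

@dataclass
class Column:
    name: str
    kind: str  # "numeric" or "categorical"

# An uninstantiated schema: column names and types only, no data yet.
# (Hypothetical example; not taken from the paper.)
SCHEMA = [
    Column("age", "numeric"),
    Column("smoker", "categorical"),
    Column("annual_premium", "numeric"),
]

def sample_row(rng):
    """A hand-written structured generator: each column is sampled
    conditionally on the columns it depends on."""
    age = rng.randint(18, 80)          # inclusive on both ends
    smoker = rng.random() < 0.2        # assumed 20% base rate
    premium = 200 + 12 * age + (900 if smoker else 0) + rng.gauss(0, 50)
    return {
        "age": age,
        "smoker": "yes" if smoker else "no",
        "annual_premium": round(premium, 2),
    }

def instantiate(n, seed=0):
    """Weakly instantiate the schema by drawing n synthetic rows."""
    rng = random.Random(seed)
    return [sample_row(rng) for _ in range(n)]

if __name__ == "__main__":
    for row in instantiate(3):
        print(row)
```

Rows produced this way are only rough pseudo-values, which is why a later refinement step is needed before they are mixed with real data.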