Gypscie: A Cross-Platform AI Artifact Management System

arXiv cs.AI / 4/14/2026


Key Points

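  • Gypscie is proposed as a cross-platform system for unified management of the diverse AI artifacts handled across the AI model lifecycle (data collection and preparation, training, evaluation, deployment, and monitoring).
  • To hide complexity from applications, it provides a knowledge graph that captures the semantics of artifacts and a rule-based query language that supports reasoning over data and models.
  • Model lifecycle activities are expressed as high-level dataflows that can be scheduled across multiple computing platforms, such as servers, cloud platforms, and supercomputers.
  • It records the provenance of the artifacts it produces and updates, which supports explainability.
  • A qualitative comparison and an experimental evaluation show that it can optimize and schedule dataflows from an abstract specification.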

Abstract

Artificial Intelligence (AI) models, encompassing both traditional machine learning (ML) and more advanced approaches such as deep learning and large language models (LLMs), play a central role in modern applications. AI model lifecycle management involves the end-to-end process of managing these models, from data collection and preparation to model building, evaluation, deployment, and continuous monitoring. This process is inherently complex, as it requires the coordination of diverse services that manage AI artifacts such as datasets, dataflows, and models, all orchestrated to operate seamlessly. In this context, it is essential to isolate applications from the complexity of interacting with heterogeneous services, datasets, and AI platforms. In this paper, we introduce Gypscie, a cross-platform AI artifact management system. By providing a unified view of all AI artifacts, the Gypscie platform simplifies the development and deployment of AI applications. This unified view is realized through a knowledge graph that captures application semantics and a rule-based query language that supports reasoning over data and models. Model lifecycle activities are represented as high-level dataflows that can be scheduled across multiple platforms, such as servers, cloud platforms, or supercomputers. Finally, Gypscie records provenance information about the artifacts it produces, thereby enabling explainability. Our qualitative comparison with representative AI systems shows that Gypscie supports a broader range of functionalities across the AI artifact lifecycle. Our experimental evaluation demonstrates that Gypscie can successfully optimize and schedule dataflows on AI platforms from an abstract specification.
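The abstract does not show Gypscie's actual interfaces, so the following is only a minimal sketch of the general idea it describes: lifecycle activities expressed as a high-level dataflow, each step assigned to a platform, with provenance recorded for every artifact produced. All names here (`Step`, `Dataflow`, `ProvenanceRecord`, `lineage`, and the platform labels) are hypothetical and chosen for illustration; they are not Gypscie's API, and the steps run locally rather than on real remote platforms.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, List


@dataclass
class Step:
    """One lifecycle activity (hypothetical type, not part of Gypscie)."""
    name: str
    fn: Callable[[Any], Any]
    platform: str  # illustrative label only, e.g. "server" or "cloud"


@dataclass
class ProvenanceRecord:
    """What was produced, by which step, on which platform."""
    step: str
    platform: str
    input_repr: str
    output_repr: str


@dataclass
class Dataflow:
    """A linear chain of steps; a real system would allow general graphs."""
    steps: List[Step]
    provenance: List[ProvenanceRecord] = field(default_factory=list)

    def run(self, data: Any) -> Any:
        # Execute steps in order and log one provenance record per step.
        for step in self.steps:
            result = step.fn(data)
            self.provenance.append(
                ProvenanceRecord(step.name, step.platform, repr(data), repr(result))
            )
            data = result
        return data


def lineage(flow: Dataflow) -> List[str]:
    """Summarize which step ran where, in execution order."""
    return [f"{r.step}@{r.platform}" for r in flow.provenance]


# Toy stand-ins for data preparation, training, and evaluation.
flow = Dataflow([
    Step("prepare", lambda xs: [x for x in xs if x is not None], "server"),
    Step("train", lambda xs: sum(xs) / len(xs), "supercomputer"),
    Step("evaluate", lambda mean: {"mean": mean, "ok": mean > 0}, "cloud"),
])

result = flow.run([1.0, None, 3.0])
print(result)
print(lineage(flow))
```

In this sketch the provenance log is what makes the final artifact explainable: each record ties an output back to the step, platform, and input that produced it. Optimization and actual cross-platform scheduling, which the abstract attributes to Gypscie, are deliberately left out.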