TIEG-Youpu Solution for NeurIPS 2022 WikiKG90Mv2-LSC

arXiv cs.CL / 3/31/2026

Key Points

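  • The paper proposes a new knowledge graph embedding method for WikiKG90Mv2, the large-scale encyclopedic knowledge graph (more than 90 million entities) used at NeurIPS 2022.
  • It adopts a retrieve-then-re-rank pipeline and introduces a "priority infilling retrieval model" to obtain candidates that are structurally and semantically similar.
  • In the re-ranking stage, an ensemble re-ranking model that uses neighbor-enhanced representations produces the final link predictions.
  • The authors report that the proposed method outperforms existing baselines in their experiments, improving validation-set MRR from 0.2342 to 0.2839.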
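
For readers unfamiliar with the metric, the following is a minimal sketch of how mean reciprocal rank (MRR) can be computed for link prediction. The function name, the toy entity IDs, and the convention that a missing true tail scores 0 are illustrative assumptions, not details taken from the paper. The last lines only restate the reported validation numbers as absolute and relative gains.

```python
def mean_reciprocal_rank(ranked_candidates, true_tails):
    """Average of 1/rank of the true tail entity over all queries.

    Assumption: a query whose true tail is absent from its candidate
    list contributes 0 to the sum.
    """
    total = 0.0
    for candidates, truth in zip(ranked_candidates, true_tails):
        if truth in candidates:
            # list.index is 0-based, ranks are 1-based
            total += 1.0 / (candidates.index(truth) + 1)
    return total / len(true_tails)


# Toy data (hypothetical entity IDs): three queries, three candidates each.
ranked = [[7, 3, 9], [4, 1, 2], [5, 6, 8]]
truth = [3, 4, 0]  # ranks: 2, 1, and "not retrieved"
mrr = mean_reciprocal_rank(ranked, truth)

# Gains implied by the reported validation MRR values.
baseline_mrr, proposed_mrr = 0.2342, 0.2839
absolute_gain = proposed_mrr - baseline_mrr
relative_gain = absolute_gain / baseline_mrr
print(f"toy MRR={mrr:.4f}, absolute gain={absolute_gain:.4f}, "
      f"relative gain={relative_gain:.1%}")
```

Because a true tail that never reaches the candidate list contributes nothing under this convention, the recall of the retrieval stage caps the MRR that any re-ranker can achieve.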

Abstract

WikiKG90Mv2, used in NeurIPS 2022, is a large encyclopedic knowledge graph. Embedding knowledge graphs into continuous vector spaces is important for many practical applications, such as knowledge acquisition, question answering, and recommendation systems. Compared to existing knowledge graphs, WikiKG90Mv2 is large-scale, comprising more than 90 million entities. Both efficiency and accuracy must therefore be considered when building graph embedding models for a knowledge graph at this scale. To this end, we follow the retrieve-then-re-rank pipeline and make novel modifications in both the retrieval and re-ranking stages. Specifically, we propose a priority infilling retrieval model to obtain candidates that are structurally and semantically similar. We then propose an ensemble-based re-ranking model with neighbor-enhanced representations to produce the final link prediction results among the retrieved candidates. Experimental results show that our proposed method outperforms existing baseline methods and improves the MRR on the validation set from 0.2342 to 0.2839.
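
The two-stage idea in the abstract can be illustrated with a small, generic sketch. This is not the authors' priority infilling retrieval model or their ensemble re-ranker. It assumes a plain dot-product retriever and an unweighted average of two hypothetical scorers, and every name, vector, and score below is invented for illustration.

```python
import heapq


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def retrieve(query_vec, entity_vecs, k):
    """Stage 1: score every entity cheaply and keep the top-k entity IDs."""
    scores = {eid: dot(query_vec, vec) for eid, vec in entity_vecs.items()}
    return heapq.nlargest(k, scores, key=scores.get)


def rerank(candidates, scorers):
    """Stage 2: re-order only the retrieved candidates by the mean
    score of several (possibly expensive) scorers."""
    def ensemble_score(eid):
        return sum(scorer(eid) for scorer in scorers) / len(scorers)
    return sorted(candidates, key=ensemble_score, reverse=True)


# Toy 2-d embeddings for four hypothetical tail entities.
entity_vecs = {0: [1.0, 0.0], 1: [0.9, 0.1], 2: [0.0, 1.0], 3: [-1.0, 0.0]}
query_vec = [1.0, 0.2]  # stands in for an encoded (head, relation) query

candidates = retrieve(query_vec, entity_vecs, k=2)

# Two hypothetical scorers: the embedding score again, plus a stand-in
# for a neighbor-aware model, represented here by a fixed lookup table.
neighbor_scores = {0: 0.1, 1: 0.9}
scorers = [
    lambda eid: dot(query_vec, entity_vecs[eid]),
    lambda eid: neighbor_scores[eid],
]
final_ranking = rerank(candidates, scorers)
print(candidates, final_ranking)
```

The split keeps the expensive scorers off the full entity set: only the k retrieved candidates are re-scored, which is what makes this kind of pipeline practical for a graph with tens of millions of entities.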