SyriSign: A Parallel Corpus for Arabic Text to Syrian Arabic Sign Language Translation

arXiv cs.CL / 4/1/2026


Key Points

  • The paper introduces SyriSign, a new parallel dataset for translating Arabic text into Syrian Arabic Sign Language (SyArSL), addressing the lack of publicly available resources for this low-resource sign language.
  • SyriSign contains 1,500 video samples covering 150 unique lexical signs, targeting text-to-SyArSL translation and related motion/sign generation tasks.
  • The authors evaluate three deep learning approaches (MotionCLIP, T2M-GPT, and SignCLIP) and find that generative methods show strong potential for sign representation.
  • Results also show that the dataset’s limited size restricts generalization, indicating a need for larger-scale data to improve performance.
  • The dataset is planned for public release and is intended to serve as an initial benchmark to support research and accessibility-oriented applications in Syria.

Abstract

Sign language is the primary means of communication for the Deaf and Hard-of-Hearing (DHH) community. While there are numerous benchmarks for high-resource sign languages, low-resource ones, including Arabic sign languages, remain underrepresented. Currently, there is no publicly available dataset for Syrian Arabic Sign Language (SyArSL). To address this gap, we introduce SyriSign, a dataset comprising 1,500 video samples across 150 unique lexical signs, designed for text-to-SyArSL translation tasks. This work aims to reduce communication barriers in Syria, as most news is delivered in spoken or written Arabic, which is often inaccessible to the deaf community. We evaluated SyriSign using three deep learning architectures: MotionCLIP for semantic motion generation, T2M-GPT for text-conditioned motion synthesis, and SignCLIP for bilingual embedding alignment. Experimental results indicate that while generative approaches show strong potential for sign representation, the limited dataset size constrains generalization performance. We will release SyriSign publicly, hoping it serves as an initial benchmark.
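
To make the dataset-size constraint concrete, the sketch below works through the arithmetic implied by the abstract's figures: 1,500 samples over 150 signs is an average of 10 samples per sign. Only those two numbers come from the abstract. The even distribution across signs, the 80/20 per-sign split, the placeholder sample IDs, and the helper names `make_manifest` and `stratified_split` are all assumptions for illustration, not the authors' protocol.

```python
import random
from collections import defaultdict

NUM_SIGNS = 150      # from the abstract
NUM_SAMPLES = 1500   # from the abstract


def make_manifest(num_signs, num_samples):
    """Build placeholder (sample_id, sign_id) pairs.

    Assumes samples are spread evenly across signs; the abstract
    does not state the actual per-sign distribution.
    """
    per_sign = num_samples // num_signs
    return [
        (f"sample_{sign:03d}_{k:02d}", sign)
        for sign in range(num_signs)
        for k in range(per_sign)
    ]


def stratified_split(manifest, test_fraction=0.2, seed=0):
    """Hold out a fraction of each sign's samples (hypothetical protocol)."""
    rng = random.Random(seed)
    by_sign = defaultdict(list)
    for sample_id, sign_id in manifest:
        by_sign[sign_id].append(sample_id)

    train, test = [], []
    for sign_id, ids in sorted(by_sign.items()):
        rng.shuffle(ids)
        # Keep at least one held-out sample per sign.
        n_test = max(1, round(len(ids) * test_fraction))
        test += [(i, sign_id) for i in ids[:n_test]]
        train += [(i, sign_id) for i in ids[n_test:]]
    return train, test


manifest = make_manifest(NUM_SIGNS, NUM_SAMPLES)
train, test = stratified_split(manifest)
samples_per_sign = len(manifest) // NUM_SIGNS
print(samples_per_sign, len(train), len(test))  # 10 1200 300
```

Under these assumptions each sign is evaluated on only two held-out clips and trained on eight, which illustrates why a dataset of this size leaves little room for generalization.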