AI Navigate

Evaluating FrameNet-Based Semantic Modeling for Gender-Based Violence Detection in Clinical Records

arXiv cs.CL / 3/20/2026

💬 OpinionSignals & Early TrendsModels & Research

Key Points

  • FrameNet-based semantic annotation of open-text fields in electronic medical records can improve the identification of GBV patterns compared with purely categorical models.
  • The study assesses three setups: frame-annotated text, frame-annotated text with parameterized data, and parameterized data alone, finding semantic-annotation variants outperform the baseline.
  • The models incorporating semantic annotation achieve over a 0.3 improvement in F1 score, indicating domain-specific semantic representations provide meaningful signals beyond structured data.
  • The results suggest semantic analysis of clinical narratives can enhance early GBV identification and support more informed public health interventions, with implications for data integration in healthcare systems.
  • The work demonstrates the value of linguistic-semantic approaches in clinical NLP for public health surveillance.

Abstract

Gender-based violence (GBV) is a major public health issue, with the World Health Organization estimating that one in three women experiences physical or sexual violence by an intimate partner during her lifetime. In Brazil, although healthcare professionals are legally required to report such cases, underreporting remains significant due to difficulties in identifying abuse and limited integration between public information systems. This study investigates whether FrameNet-based semantic annotation of open-text fields in electronic medical records can support the identification of patterns of GBV. We compare the performance of an SVM classifier for GBV cases trained on (1) frame-annotated text, (2) annotated text combined with parameterized data, and (3) parameterized data alone. Quantitative and qualitative analyses show that models incorporating semantic annotation outperform categorical models, achieving over 0.3 improvement in F1 score and demonstrating that domain-specific semantic representations provide meaningful signals beyond structured demographic data. The findings support the hypothesis that semantic analysis of clinical narratives can enhance early identification strategies and support more informed public health interventions.