LaMSUM: Amplifying Voices Against Harassment through LLM Guided Extractive Summarization of User Incident Reports

arXiv cs.CL / 4/20/2026

Key Points

  • The paper introduces LaMSUM, a multi-level framework that uses LLMs to produce extractive summaries (selecting key excerpts) of user incident reports rather than only abstractive paraphrases.
  • It is designed to handle large volumes of citizen-reported sexual harassment data, addressing practical constraints such as LLM context window limits and the need to support various code-mixed languages.
  • LaMSUM combines summarization with multiple voting methods to improve the quality and reliability of the selected excerpts across large collections of reports.
  • The authors evaluate the approach on incident-report summarization using four widely used LLMs (Llama, Mistral, Claude, and GPT-4o) and report that LaMSUM outperforms existing state-of-the-art extractive summarization baselines.
  • The work aims to help relevant stakeholders quickly obtain comprehensive overviews of incidents, supporting better policy development to reduce unwarranted harassment.
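To make the voting idea in the key points concrete, here is a minimal sketch of one possible scheme: approval-style counting over several independent selection runs. The summary above does not say which voting methods LaMSUM uses, so the function name `vote_select`, the tie-breaking rule, and the sample data are all illustrative assumptions rather than the authors' method.

```python
from collections import Counter


def vote_select(selections, k):
    """Keep the k sentence indices chosen most often across runs.

    selections: list of lists; each inner list holds the sentence
    indices picked in one run (e.g. one LLM call).
    Ties are broken by the lower index, which is an assumed rule.
    """
    # Count each index at most once per run.
    counts = Counter(idx for run in selections for idx in set(run))
    # Most votes first, then lower index first.
    ranked = sorted(counts, key=lambda idx: (-counts[idx], idx))
    # Return the winners in document order so excerpts stay readable.
    return sorted(ranked[:k])


# Hypothetical output of three selection runs over the same reports.
runs = [[0, 2, 5], [2, 5, 7], [1, 2, 5]]
summary_indices = vote_select(runs, k=2)
```

Aggregating over several runs is one way to dampen the run-to-run variation of an LLM's picks; other schemes, such as rank-based voting, would slot into the same place.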

Abstract

Citizen reporting platforms help the public and authorities stay informed about sexual harassment incidents. However, the high volume of data shared on these platforms makes reviewing each individual case challenging. Therefore, a summarization algorithm capable of processing and understanding various code-mixed languages is essential. In recent years, Large Language Models (LLMs) have shown exceptional performance in NLP tasks, including summarization. LLMs inherently produce abstractive summaries by paraphrasing the original text, while the generation of extractive summaries (selecting specific subsets of the original text) through LLMs remains largely unexplored. Moreover, LLMs have a limited context window size, restricting the amount of data that can be processed at once. We tackle these challenges by introducing LaMSUM, a novel multi-level framework combining summarization with different voting methods to generate extractive summaries for large collections of incident reports using LLMs. Extensive evaluation using four popular LLMs (Llama, Mistral, Claude and GPT-4o) demonstrates that LaMSUM outperforms state-of-the-art extractive summarization methods. Overall, this work represents one of the first attempts to achieve extractive summarization through LLMs, and is likely to support stakeholders by offering a comprehensive overview and enabling them to develop effective policies to minimize incidents of unwarranted harassment.
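The abstract does not spell out how the multi-level framework works around the context window limit. One plausible reading is sketched below: split the reports into chunks small enough for one call, keep a few sentences from each chunk, and repeat on the survivors until only the target number remains. The names `multi_level_extract`, `chunk` and `longest_first` are hypothetical, and `longest_first` is only a stand-in for an LLM call that returns verbatim sentences.

```python
def chunk(items, size):
    """Split items into consecutive pieces of at most `size` elements."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def multi_level_extract(sentences, k, chunk_size, select):
    """Reduce `sentences` to at most k verbatim sentences, level by level.

    select(part, n) must return n sentences taken unchanged from `part`;
    in practice it would wrap an LLM call (an assumption, not shown here).
    """
    if chunk_size <= k:
        raise ValueError("chunk_size must exceed k so each level shrinks")
    pool = list(sentences)
    while len(pool) > k:
        if len(pool) <= chunk_size:
            # Everything fits in one call: final selection.
            return select(pool, k)
        survivors = []
        for part in chunk(pool, chunk_size):
            survivors.extend(select(part, min(k, len(part))))
        pool = survivors  # next level works on the shortlisted sentences
    return pool


def longest_first(part, n):
    """Stand-in selector: keep the n longest sentences, in original order."""
    by_length = sorted(range(len(part)), key=lambda i: -len(part[i]))
    return [part[i] for i in sorted(by_length[:n])]


# Hypothetical reports whose length grows with their index.
reports = [f"report {i} " + "x" * i for i in range(10)]
summary = multi_level_extract(reports, k=2, chunk_size=4, select=longest_first)
```

Because each selector call only ever sees `chunk_size` sentences, the per-call input stays bounded no matter how large the collection is, and every output sentence is copied unchanged from the input, which is what makes the summary extractive.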