CBRS: Cognitive Blood Request System with Bilingual Dataset and Dual-Layer Filtering for Multi-Platform Social Streams

arXiv cs.CL / 21 Apr 2026


Key Points

  • The study proposes CBRS, a multi-platform system that automatically filters and parses urgent blood donation requests from high-volume social media streams to reduce missed alerts and response delays.
  • It introduces a new dataset of 11K blood donation messages spanning Bengali, English, and transliterated Bengali, capturing the linguistic diversity of real social media communication and improving robustness.
  • CBRS uses a cost-efficient dual-layer filtering architecture and adversarial negatives to strengthen request detection performance.
  • The system reportedly achieves 99% accuracy and precision for filtering, and a LoRA fine-tuned Llama-3.2-3B model reaches 92% zero-shot parsing accuracy with a 35× reduction in input token usage.
  • Code, dataset, and trained models are released publicly, enabling further research and deployment for scalable, inclusive information extraction in time-sensitive contexts.
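The dual-layer idea above — a cheap filter that discards the bulk of the stream before an expensive model sees anything — can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the keywords, the toy second-layer rule, and all function names are assumptions; in CBRS the second layer is a learned classifier and the parsing is done by an LLM.

```python
import re

# Layer 1: cheap lexical screen over the raw stream.
# Keyword list is illustrative only (not from the paper).
BLOOD_KEYWORDS = re.compile(
    r"\b(blood|donor|donate|রক্ত|rokto)\b|\b(A|B|AB|O)[+-]",
    re.IGNORECASE,
)

def layer1_lexical_filter(post: str) -> bool:
    """Keep only posts that mention blood-related terms at all."""
    return bool(BLOOD_KEYWORDS.search(post))

def layer2_classify(post: str) -> bool:
    """Stand-in for the learned classifier: here, a toy rule that
    demands both a blood group and an urgency cue."""
    has_group = re.search(r"\b(A|B|AB|O)[+-]", post) is not None
    has_urgency = re.search(r"\b(urgent|need|emergency)\b", post, re.IGNORECASE) is not None
    return has_group and has_urgency

def filter_stream(posts):
    """Run the two layers in sequence; layer 2 only ever sees
    layer-1 survivors, which is where the cost saving comes from."""
    return [p for p in posts if layer1_lexical_filter(p) and layer2_classify(p)]

posts = [
    "Urgent: need O+ blood donor at Dhaka Medical College today",
    "Great football match last night!",
    "Donated blood last year, felt great",  # mentions blood, but not a request
]
print(filter_stream(posts))
# → ['Urgent: need O+ blood donor at Dhaka Medical College today']
```

The adversarial negatives mentioned in the paper target exactly the third kind of post: messages that share vocabulary with real requests but are not requests, which a purely lexical filter would pass through.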

Abstract

Urgent blood donation seeking posts and messages on social media often go unnoticed due to the overwhelming volume of daily communications. Traditional app-based systems, reliant on manual input, struggle to reach users in low-resource settings, delaying critical responses. To address this, we introduce the Cognitive Blood Request System (CBRS), a multi-platform framework that efficiently filters and parses blood donation requests from social media streams using a cost-efficient dual-layered architecture. To do so, we curate a novel dataset of 11K parsed blood donation request messages in Bengali, English, and transliterated Bengali, capturing the linguistic diversity of real social media communications. The inclusion of adversarial negatives further enhances the robustness of our model. CBRS achieves an impressive 99% accuracy and precision in filtering, surpassing benchmark methods. In the parsing task, our LoRA fine-tuned Llama-3.2-3B model achieves 92% zero-shot accuracy, surpassing the base model by 41.54% and exceeding the few-shot performance of GPT-4o-mini, Gemini-2.0-Flash, and other LLMs, while resulting in a 35× reduction in input token usage. This work lays a robust foundation for scalable, inclusive information extraction in time-sensitive, object-focused tasks. Our code, dataset, and trained models are publicly available at [https://github.com/aaniksahaa/CBRS](https://github.com/aaniksahaa/CBRS).
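The parsing task described in the abstract maps a free-form message to a structured record. In CBRS this mapping is performed by the LoRA fine-tuned Llama-3.2-3B model; the regex-based sketch below only illustrates the shape of the target output. The field names (`blood_group`, `contact`, `location`) and the extraction rules are assumptions for demonstration, not the paper's schema.

```python
import re

def parse_request(message: str) -> dict:
    """Toy extractor showing the kind of structured record the
    LLM-based parser produces. Rules are illustrative only."""
    group = re.search(r"\b(AB|A|B|O)\s*([+-]|pos|neg)", message, re.IGNORECASE)
    phone = re.search(r"(\+?\d[\d\- ]{8,}\d)", message)      # loose phone pattern
    location = re.search(r"\bat\s+([A-Z][\w ]+?)(?:[,.]|$)", message)
    return {
        "blood_group": group.group(0).replace(" ", "") if group else None,
        "contact": phone.group(1) if phone else None,
        "location": location.group(1).strip() if location else None,
    }

msg = "Urgent! Need 2 bags B+ blood at Square Hospital, call 01712345678"
print(parse_request(msg))
# → {'blood_group': 'B+', 'contact': '01712345678', 'location': 'Square Hospital'}
```

Hand-written rules like these break down on code-mixed and transliterated Bengali text, which is precisely why the paper fine-tunes a compact LLM for the parsing step rather than relying on patterns.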