MiNER: A Two-Stage Pipeline for Metadata Extraction from Municipal Meeting Minutes
arXiv cs.CL / 3/30/2026
Key Points
- The paper introduces MiNER, a two-stage pipeline to extract key metadata (meeting number, date, location, participants, and time ranges) from heterogeneous municipal meeting minutes where information is often unstandardized.
- Stage one uses a transformer-based question-answering model to locate the opening and closing text spans that contain the metadata. Stage two then extracts entities from those spans using BERTimbau and XLM-RoBERTa variants, optionally with CRF layers.
- Entity extraction is enhanced with delexicalization to improve fine-grained recognition in the municipal-minutes domain.
- The authors benchmark both open-weight (Phi) and closed-weight (Gemini) LLMs, comparing predictive performance alongside inference cost and carbon footprint.
- Results show strong in-domain accuracy but weaker cross-municipality generalization due to linguistic complexity and document variability. The work also establishes the first benchmark for this metadata-extraction task.
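
The two-stage flow described above can be illustrated with a minimal sketch. This is not the authors' implementation: the question-answering model of stage one is replaced by a hypothetical "take the first paragraph" rule, and the BERTimbau / XLM-RoBERTa taggers of stage two are replaced by a few regular expressions. The function names, label names, and the sample minutes text are all invented for illustration; only the overall shape (locate a span first, then extract entities from that span alone) follows the summary.

```python
import re
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Span:
    start: int
    end: int
    text: str


def locate_opening_span(document: str, max_chars: int = 300) -> Span:
    """Stage 1 stand-in (assumption): treat the first paragraph as the
    opening metadata span. The paper uses a QA model for this step."""
    end = document.find("\n\n")
    if end == -1:
        end = min(len(document), max_chars)
    return Span(0, end, document[:end])


# Stage 2 stand-in (assumption): regex patterns instead of a trained tagger.
PATTERNS = {
    "MEETING_NUMBER": re.compile(r"\bmeeting no\. (\d+)", re.IGNORECASE),
    "DATE": re.compile(r"\b(\d{1,2} \w+ \d{4})\b"),
    "TIME": re.compile(r"\b(\d{1,2}:\d{2})\b"),
}


def extract_entities(span: Span) -> Dict[str, List[str]]:
    """Run every pattern over the located span only, not the whole document."""
    return {label: pattern.findall(span.text) for label, pattern in PATTERNS.items()}


def run_pipeline(document: str) -> Dict[str, List[str]]:
    """Chain the two stages: locate the span, then extract from it."""
    return extract_entities(locate_opening_span(document))


if __name__ == "__main__":
    # Invented sample text, not taken from the paper's dataset.
    minutes = (
        "Minutes of meeting no. 12, held on 14 March 2024 at 09:30 in the Town Hall.\n\n"
        "The mayor opened the session. A budget item was discussed at 10:15."
    )
    print(run_pipeline(minutes))
```

In this toy input, the time mentioned in the body of the minutes falls outside the located span and is therefore never offered to the extractor, which is the practical point of narrowing the context before entity extraction.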