OCR for redaction tasks is harder for VLMs than plain transcription: accurate bounding boxes for every word on a page are essential to obscure the right words. Until recently, most VLMs (particularly open-source ones) have not been good at this. Early in February, I posted here my tests with Qwen 3 VL 8B Instruct on bounding-box OCR and redaction tasks; with its strong performance on handwritten text, it seemed like it could fit into a redaction workflow. Since then, Qwen 3.5 has arrived, and in this post I discuss some of my early tests with these models (full post link at bottom).

Models and tasks for testing

I tested four Qwen models that can run in under 24 GB of VRAM (Qwen 3 VL 8B, Qwen 3.5 9B, 35B A3B, and 27B) on three 'difficult' OCR/redaction tasks. For testing I used the doc_redaction open-source repo, also linked in the post below.
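To make the bounding-box requirement concrete, here is a minimal sketch of the kind of post-processing such a workflow needs. It assumes the VLM returns a JSON list of words with a `bbox_2d` field in normalised 0–1000 coordinates (a common VLM output convention, but check your model's actual scheme); the field names and the sample output are illustrative, not the model's documented format.

```python
import json

def scale_boxes(vlm_json: str, page_w: int, page_h: int):
    """Convert normalised 0-1000 word boxes from a VLM response
    into pixel coordinates for the actual page size."""
    words = json.loads(vlm_json)
    out = []
    for w in words:
        x1, y1, x2, y2 = w["bbox_2d"]
        out.append({
            "text": w["text"],
            # integer pixel coords; floor division keeps boxes on-page
            "bbox": (x1 * page_w // 1000, y1 * page_h // 1000,
                     x2 * page_w // 1000, y2 * page_h // 1000),
        })
    return out

# Hypothetical single-word response for a 2000x1000 px page
sample = '[{"text": "Lauren", "bbox_2d": [100, 50, 300, 90]}]'
print(scale_boxes(sample, 2000, 1000))
```

A box that is even slightly off after scaling leaves part of a word visible, which is why per-word box accuracy matters so much more here than for transcription.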
Findings

My conclusion is that of all the models I tried, Qwen 3.5 27B is the best local model to fit into a redaction workflow. On Task 1, it was very good at reading the text content and encapsulating all words, see below:

Task 1: Text identification and location with Qwen 3.5 27B (4-bit quantised)

My only caveat on Qwen 3.5 27B's Task 1 performance is that, with different quants/settings, the model would sometimes completely miss lines of text. This is a symptom of VLM 'laziness' that I often see on text-heavy pages, so I would still advise having a human check the results of this approach.

On Task 2, it successfully recognised two faces on the page but, as with the other models I tested, failed to fully cover the faces with a bounding box, resulting in a failed redaction:

Task 2: Face identification and location with Qwen 3.5 27B (4-bit quantised)

For Task 3, Qwen 3.5 27B performed well and correctly identified all relevant text and relative character positions (with some Python post-processing to help), given the following instructions: "Redact Lauren's name (always cover the full name if available), email addresses, and phone numbers with the label LAUREN. Redact university names with the label UNIVERSITY. Always include the full university name if available."

Task 3: Redaction output for custom entity detection using Qwen 3.5 27B (4-bit quantised)

In testing other models on this task, I found that anything smaller than ~27B seemed to struggle.

Recommendations

Qwen 3.5 27B was the best of the models I tested, and I think it is now performant enough to make redaction feasible with a VLM you can run on a consumer GPU (24 GB VRAM or lower). Based on the above findings, this is what I would recommend for different tasks:
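The Task 3 instruction "always cover the full name if available" hints at the kind of Python post-processing mentioned above: word-level boxes belonging to one entity have to be merged into a single covering rectangle before redaction. A minimal sketch of that step, assuming `(x1, y1, x2, y2)` pixel boxes (the function name and box format are my own, not from the doc_redaction repo):

```python
def merge_boxes(boxes):
    """Merge word-level (x1, y1, x2, y2) boxes into one covering box,
    e.g. so a first name + surname pair gets a single redaction."""
    if not boxes:
        raise ValueError("need at least one box to merge")
    return (
        min(b[0] for b in boxes),  # leftmost edge
        min(b[1] for b in boxes),  # topmost edge
        max(b[2] for b in boxes),  # rightmost edge
        max(b[3] for b in boxes),  # bottommost edge
    )

# Two adjacent word boxes for a full name on the same line
print(merge_boxes([(10, 20, 60, 40), (65, 22, 140, 41)]))
```

The merged box can then be drawn as a filled rectangle over the page image; this is also the step where a model that misses one word of a name would silently produce a partial redaction, reinforcing the case for human review.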
More details in the full post: OCR and redaction with Qwen 3.5 - full post with test results

Has anyone else here tried using VLMs for redaction tasks? Have they been effective and reliable? Are there any VLMs apart from the Qwen models that you have found useful for this?
Testing Qwen 3.5 for OCR and redaction tasks
Reddit r/LocalLLaMA / 3/29/2026
💬 Opinion · Tools & Practical Usage · Models & Research
Key Points
- The post evaluates several Qwen vision-language models for OCR and document redaction, focusing on tasks that require precise word/line bounding boxes.
- Across three difficult scenarios (handwritten OCR with line/word boxes, detecting and fully covering faces, and finding custom entities for redaction), the author uses the doc_redaction open-source repository for testing.
- The author concludes that Qwen 3.5 27B is the best option among the tested locally runnable models for integrating into a redaction workflow.
- A key limitation is that even Qwen 3.5 27B can miss entire lines depending on quantization/settings, and the author recommends human verification to catch such failures.
- For face redaction, the models tend to recognize faces but still fail to fully cover them with bounding boxes, leading to unsuccessful redaction outcomes.