A Robust Deep Learning Framework for Bangla License Plate Recognition Using YOLO and Vision-Language OCR
arXiv cs.CV / 3/12/2026
Key Points
- The paper introduces a robust Bangla license plate recognition system that combines a deep learning–based localization model with OCR to extract text, achieving 97.83% accuracy and an IoU of 91.3% on Bangla plates.
- It evaluates multiple object detection architectures, including U-Net and YOLO variants, and proposes a two-stage adaptive training strategy based on YOLOv8 to enhance localization performance.
- Text recognition is formulated as a sequence generation problem using a VisionEncoderDecoder framework, with ViT + BanglaBERT achieving a character error rate of 0.1323 and a word error rate of 0.1068.
- The framework demonstrates robustness across diverse real-world conditions and is positioned for deployment in intelligent transportation applications such as automated law enforcement and access control.
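
For readers unfamiliar with the metrics quoted above, the following is a minimal sketch of how IoU, character error rate (CER), and word error rate (WER) are conventionally computed. It is not taken from the paper: the function names, the corner-coordinate box format, and the choice to count characters as Unicode code points are all assumptions made for illustration, and the authors' evaluation code may differ (for example, by counting Bangla grapheme clusters instead of code points).

```python
from typing import Sequence, Tuple

# Assumed box format: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    # Clamp at zero so disjoint boxes give an empty intersection.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def edit_distance(ref: Sequence, hyp: Sequence) -> int:
    """Levenshtein distance (insertions, deletions, substitutions)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        curr = [i]
        for j, h in enumerate(hyp, 1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """Character error rate: edit distance over reference length.

    Assumption: characters are Unicode code points, which splits
    Bangla conjuncts and vowel signs into separate units.
    """
    return edit_distance(ref, hyp) / max(len(ref), 1)


def wer(ref: str, hyp: str) -> float:
    """Word error rate over whitespace-separated tokens."""
    ref_words = ref.split()
    return edit_distance(ref_words, hyp.split()) / max(len(ref_words), 1)
```

As a usage example, `iou((0, 0, 2, 2), (1, 1, 3, 3))` compares two 2×2 boxes that overlap in a 1×1 square, and `cer(reference_plate, predicted_plate)` would be averaged over a test set to obtain a figure comparable in kind to the 0.1323 reported above.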