Evaluate PP-OCRv6 online and integrate lightweight, production-ready OCR with PaddlePaddle, Transformers, or ONNX Runtime backends.
PP-OCRv6 is the latest generation of PaddleOCR’s universal OCR model family. It is designed for real-world text detection and recognition across documents, screenshots, multilingual images, digital displays, industrial labels, and scene text.
The model family scales from 1.5 million to 34.5 million parameters and is divided into three tiers: tiny, small, and medium. The medium and small tiers support 50 languages, including Simplified Chinese, Traditional Chinese, English, Japanese, and 46 Latin-script languages. Try PP-OCRv6 online now: PP-OCRv6 Online Demo.

In PaddleOCR’s official in-house multi-scenario OCR benchmark, PP-OCRv6_medium reaches 86.2% in detection Hmean and 83.2% in recognition accuracy. Compared to PP-OCRv5_server, it improves text detection by +4.6 percentage points and text recognition by +5.1 percentage points.
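For readers less familiar with the metric, detection Hmean is conventionally the harmonic mean of detection precision and recall. The minimal sketch below shows that calculation and back-computes the PP-OCRv5_server baselines implied by the reported gains. The function name and the example precision/recall values are illustrative assumptions and do not come from the benchmark.

```python
def hmean(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (the usual detection Hmean)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical precision/recall values, for illustration only.
example = hmean(0.90, 0.80)

# Baselines implied by the reported gains (in percentage points).
v5_server_det = round(86.2 - 4.6, 1)  # implied detection Hmean
v5_server_rec = round(83.2 - 5.1, 1)  # implied recognition accuracy

print(f"{example:.4f}", v5_server_det, v5_server_rec)
```

Because the gains are stated in percentage points, the baseline is a plain subtraction rather than a relative change.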

PP-OCRv6 focuses on practical OCR needs: accurate, structured text output from compact models with flexible deployment options. To learn more about why specialized OCR models remain useful in the VLM era, see our previous blog post, PP-OCRv5 on Hugging Face: A Specialized Approach to OCR.
New features of PP-OCRv6
PP-OCRv6 introduces architecture, training, and data improvements across detection and recognition. The main design goal is to improve OCR accuracy while maintaining a model size suitable for various deployment settings.
Three model tiers
PP-OCRv6 provides three model tiers covering a range of model sizes and OCR accuracy levels.
| Model | Model Size | Detection Hmean | Recognition Accuracy | Typical Application Scenarios |
| --- | --- | --- | --- | --- |
| PP-OCRv6_tiny | 1.5M parameters | 80.6% | 73.5% | Edge devices, lightweight local OCR, latency-sensitive demos, constrained environments |
| PP-OCRv6_small | 7.7M parameters | 84.1% | 81.3% | Mobile, desktop, balanced OCR services, low-cost multilingual OCR |
| PP-OCRv6_medium | 34.5M parameters | 86.2% | 83.2% | Accuracy-oriented OCR, server-side pipelines, industrial OCR, document ingestion, multilingual OCR |
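One way to use these figures is to pick the smallest tier that satisfies an accuracy floor within a size budget. The sketch below encodes the table values; the `Tier` class and `pick_tier` helper are hypothetical names introduced for illustration and are not part of PaddleOCR.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Tier:
    name: str
    params_millions: float
    det_hmean: float      # detection Hmean, percent
    rec_accuracy: float   # recognition accuracy, percent


# Figures copied from the table above.
TIERS = [
    Tier("PP-OCRv6_tiny", 1.5, 80.6, 73.5),
    Tier("PP-OCRv6_small", 7.7, 84.1, 81.3),
    Tier("PP-OCRv6_medium", 34.5, 86.2, 83.2),
]


def pick_tier(min_rec_accuracy: float, max_params_millions: float) -> Optional[Tier]:
    """Return the smallest tier meeting the accuracy floor within the size budget."""
    for tier in sorted(TIERS, key=lambda t: t.params_millions):
        fits_budget = tier.params_millions <= max_params_millions
        accurate_enough = tier.rec_accuracy >= min_rec_accuracy
        if fits_budget and accurate_enough:
            return tier
    return None  # no tier satisfies both constraints


print(pick_tier(min_rec_accuracy=80.0, max_params_millions=10.0))
```

The benchmark numbers come from PaddleOCR's in-house evaluation, so a selection like this is only a starting point; accuracy on your own data may differ.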
PPLCNetV4 backbone
PP-OCRv6 uses PPLCNetV4 as a unified backbone for both text detection and recognition.
The main benefit for developers is consistency across the model family. The tiny, small, and medium tiers are not unrelated models; they belong to the same OCR family and share a common architectural design.
RepLKFPN for text detection
Text detection is the first stage of the OCR pipeline. The quality of detection affects the crop sent to the recognizer, and poor crop quality often leads to poor recognition.
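To make the hand-off between the two stages concrete, the sketch below turns a detected text polygon into a padded crop rectangle clamped to the image bounds. It is a generic illustration rather than PaddleOCR's internal cropping logic; the function name, the padding default, and the sample coordinates are assumptions.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]


def padded_crop_box(
    polygon: Sequence[Point],
    image_size: Tuple[int, int],
    pad: int = 2,
) -> Tuple[int, int, int, int]:
    """Return (left, top, right, bottom) for the padded bounding box of a polygon.

    image_size is (width, height). The box is clamped to the image so the
    padding never reaches outside it.
    """
    width, height = image_size
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    left = max(math.floor(min(xs)) - pad, 0)
    top = max(math.floor(min(ys)) - pad, 0)
    right = min(math.ceil(max(xs)) + pad, width)
    bottom = min(math.ceil(max(ys)) + pad, height)
    return left, top, right, bottom


# Hypothetical detection result: a slightly skewed quadrilateral.
box = [(20.0, 30.0), (120.0, 32.0), (119.0, 60.0), (19.0, 58.0)]
print(padded_crop_box(box, image_size=(200, 100)))
```

A box that is too tight clips character strokes, and a box that is too loose pulls in background clutter. Both degrade what the recognizer sees.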
PP-OCRv6 upgrades the detection module with RepLKFPN, a lightweight large-kernel feature pyramid network designed for multiscale text detection while maintaining inference efficiency.
This matters for real-world OCR input, where text may be small, dense, rotated, low-resolution, or embedded in a complex background.

EncoderWithLightSVTR for recognition
For text recognition, PP-OCRv6 uses EncoderWithLightSVTR, which combines local context modeling and global attention to improve recognition quality for difficult text crops.
Recognition improvements are particularly relevant for multilingual text, screen text, industrial characters, special symbols, dense text, and noisy image areas.

Integrated multilingual OCR
The medium and small tiers support 50 languages in one model family, covering Simplified Chinese, Traditional Chinese, English, Japanese, and 46 Latin-script languages.
This reduces the need to use separate OCR models across common multilingual OCR scenarios.
PaddleOCR quick start
Install PaddleOCR.
pip install paddleocr
Run OCR using Paddle Inference (the default backend).
from paddleocr import PaddleOCR

# Disable the optional document preprocessing steps.
ocr = PaddleOCR(
    use_doc_orientation_classify=False,
    use_doc_unwarping=False,
    use_textline_orientation=False,
)
result = ocr.predict("https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/general_ocr_002.png")

# Print each result, then save a visualization image and structured JSON.
for res in result:
    res.print()
    res.save_to_img("output")
    res.save_to_json("output")
OCR results can be saved as visualization images and structured JSON output. Structured output can be used by downstream systems such as document parsing, search, extraction, RAG, analytics, and agent workflows.
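As a sketch of how a downstream system might consume that structured output, the helper below keeps only the recognized lines above a confidence threshold. It assumes the result stores parallel lists under the keys `rec_texts` and `rec_scores`; those key names, the helper name, and the sample data are assumptions, so check the JSON your PaddleOCR version actually writes.

```python
from typing import Any, Dict, List, Tuple


def confident_lines(result: Dict[str, Any], min_score: float = 0.8) -> List[Tuple[str, float]]:
    """Keep recognized lines whose confidence is at least min_score.

    Assumes parallel lists under "rec_texts" and "rec_scores" (an assumption
    about the output schema, not a documented guarantee).
    """
    texts = result.get("rec_texts", [])
    scores = result.get("rec_scores", [])
    return [(text, score) for text, score in zip(texts, scores) if score >= min_score]


# Hand-written sample standing in for a saved result file.
sample = {
    "rec_texts": ["Invoice", "T0tal", "42.00"],
    "rec_scores": [0.99, 0.55, 0.93],
}
print(confident_lines(sample))
```

Filtering by score before indexing or extraction keeps low-confidence lines out of search and RAG pipelines, at the cost of dropping some correct text.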
Available inference backends
PP-OCRv6 can be used with multiple inference backends through PaddleOCR. PaddleOCR 3.7 provides a unified inference engine interface that lets you select the underlying runtime and pass related configuration through the pipeline or module APIs.
| Backend | Description |
| --- | --- |
| Transformers | Hugging Face / PyTorch-oriented inference path for supported PaddleOCR models |
| ONNX Runtime | Portable inference path for ONNX-based deployment environments |
| Paddle Inference | Native Paddle inference format |
For Hugging Face users, PaddleOCR supports running selected OCR and document parsing models using the Transformers backend. This can be enabled by passing the following parameter:
engine="transformers"
For more information on how the Transformers backend works with PaddleOCR, see:
PaddleOCR: Perform OCR and document parsing tasks using Transformers backends
Run the PP-OCRv6 sample using the Transformers backend.
from paddleocr import PaddleOCR

# Same pipeline as the quick start, with the Transformers backend selected.
ocr = PaddleOCR(
    use_doc_orientation_classify=False,
    use_doc_unwarping=False,
    use_textline_orientation=False,
    engine="transformers",
)
result = ocr.predict("https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/general_ocr_002.png")
ONNX variants are also available in the PP-OCRv6 collection for environments that use ONNX Runtime, selected with engine="onnxruntime".
from paddleocr import PaddleOCR

# Same pipeline as the quick start, with the ONNX Runtime backend selected.
ocr = PaddleOCR(
    use_doc_orientation_classify=False,
    use_doc_unwarping=False,
    use_textline_orientation=False,
    engine="onnxruntime",
)
result = ocr.predict("https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/general_ocr_002.png")
Together, these backend options let you use PP-OCRv6 in a variety of runtime environments while keeping the same OCR model family from the Hugging Face Hub.
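Because only the engine argument changes between backends, the choice can be kept in one place. The sketch below builds the constructor arguments for a requested backend; `build_ocr_kwargs` and `SUPPORTED_ENGINES` are hypothetical names, and the exact engine strings accepted by your PaddleOCR version should be confirmed against its documentation.

```python
from typing import Any, Dict, Optional

# Engine names as written in this post; treat the exact strings as assumptions.
SUPPORTED_ENGINES = {"transformers", "onnxruntime"}


def build_ocr_kwargs(engine: Optional[str] = None) -> Dict[str, Any]:
    """Build PaddleOCR constructor arguments for the requested backend.

    engine=None omits the argument, leaving the default backend
    (Paddle Inference) in place.
    """
    kwargs: Dict[str, Any] = {
        "use_doc_orientation_classify": False,
        "use_doc_unwarping": False,
        "use_textline_orientation": False,
    }
    if engine is not None:
        if engine not in SUPPORTED_ENGINES:
            raise ValueError(f"unsupported engine: {engine!r}")
        kwargs["engine"] = engine
    return kwargs


print(build_ocr_kwargs("onnxruntime"))
```

With paddleocr installed, the result would be passed as `PaddleOCR(**build_ocr_kwargs("onnxruntime"))`, so application code stays identical across runtimes.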
Conclusion
PP-OCRv6 extends PaddleOCR with a family of lightweight, multilingual OCR models for real-world text detection and recognition.
This release includes three model tiers from 1.5 million to 34.5 million parameters, OCR support for up to 50 languages, improved detection and recognition accuracy over PP-OCRv5_server, and multiple model formats on the Hugging Face Hub, including Safetensors, Paddle Inference models, and ONNX models.
Together with the hosted Hugging Face Space and the available PaddleOCR inference backends, PP-OCRv6 provides several entry points for evaluation and integration.
You can evaluate PP-OCRv6 with an online demo, explore available model assets in the collection, and use an inference backend that matches your own OCR workflow.

