OCR Reader is a Windows desktop application for extracting text from PDFs and images using OCR (Tesseract engine). It supports template-driven parsing and exports structured data into CSV or TXT files.
Download the latest release package containing:
OCR_Reader/
├── OCR_Reader.exe
├── config.yaml
├── tools/
│   ├── tesseract/
│   │   └── tesseract.exe
│   └── poppler/
└── input/
Run OCR_Reader.exe. On first start, the application loads its configuration from config.yaml.
The application is configured using a config.yaml file.
tesseract_path: tools/tesseract/tesseract.exe
poppler_path: tools/poppler/bin
input_folder: input
output_folder: output
output_format: csv # csv | txt
language: eng
batch_mode: true
template:
  name: invoice_template
  fields:
    invoice_number:
      type: text
      pattern: "Invoice No:\\s*(.*)"
    date:
      type: text
      pattern: "Date:\\s*(.*)"
    total:
      type: number
      pattern: "Total:\\s*([0-9.,]+)"
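To illustrate how the template patterns behave, here is a minimal Python sketch that applies the three regular expressions from the configuration to a piece of recognized text. It is not the application's actual code. The function name extract_fields, the sample text, and the handling of the number type (commas treated as thousands separators) are assumptions made for this example.

```python
import re

# Patterns copied from the template in config.yaml. In YAML double quotes,
# "\\s" is read as the regex token \s.
TEMPLATE_FIELDS = {
    "invoice_number": {"type": "text", "pattern": r"Invoice No:\s*(.*)"},
    "date": {"type": "text", "pattern": r"Date:\s*(.*)"},
    "total": {"type": "number", "pattern": r"Total:\s*([0-9.,]+)"},
}


def extract_fields(ocr_text, fields):
    """Hypothetical helper: return {field_name: value}, None if no match."""
    result = {}
    for name, spec in fields.items():
        match = re.search(spec["pattern"], ocr_text)
        if match is None:
            result[name] = None
            continue
        # The first capture group holds the field value.
        value = match.group(1).strip()
        if spec["type"] == "number":
            # Assumption: commas are thousands separators, dot is the decimal point.
            try:
                value = float(value.replace(",", ""))
            except ValueError:
                value = None
        result[name] = value
    return result


# Hypothetical OCR output for a one-page invoice.
sample_text = "Invoice No: INV-1023\nDate: 2026-04-10\nTotal: 1250.50\n"
record = extract_fields(sample_text, TEMPLATE_FIELDS)
print(record)
```

Because `.` does not match a newline by default, `(.*)` captures only the rest of the line that follows the label. When writing a custom pattern, make sure each label (for example "Invoice No:") appears on the same line as its value in the recognized text.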
invoice_number, date, total
INV-1023, 2026-04-10, 1250.50
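For integration work, a row like the one above can be produced or consumed with Python's standard csv module. The sketch below is an illustration only: the helper name write_records is hypothetical, and the exact delimiter and spacing the application writes may differ from what csv.DictWriter emits by default.

```python
import csv
import io


def write_records(records, fieldnames, stream):
    """Hypothetical helper: write a header row, then one row per record."""
    writer = csv.DictWriter(stream, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(records)


# Values kept as strings so the total keeps its two decimal places.
records = [
    {"invoice_number": "INV-1023", "date": "2026-04-10", "total": "1250.50"},
]

# An in-memory buffer stands in for a file in the output folder.
buffer = io.StringIO()
write_records(records, ["invoice_number", "date", "total"], buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

When writing to a real file instead of a buffer, open it with newline="" as the csv module documentation recommends, so that line endings are not doubled on Windows.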
For custom templates or integration help, contact:
nezval.software@gmail.com