Projects

A from-scratch implementation of the CNN-RNN-CTC handwriting recognition stack — written to understand the pipeline rather than to chase a leaderboard.

Problem

OCR is usually treated as a closed API call. This project takes it apart: image preprocessing → convolutional feature extractor → recurrent sequence model → CTC alignment → decoded string, with no high-level OCR library in the path.
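The stages above can be sketched in PyTorch roughly as follows. This is a minimal illustration, not the repo's model: the class name `TinyCRNN`, the layer sizes, the 32-pixel input height, the 80-class alphabet, and the choice of index 0 as the CTC blank are all assumptions made for the example.

```python
import torch
import torch.nn as nn


class TinyCRNN(nn.Module):
    """Illustrative CNN -> BiLSTM -> CTC head. All sizes are assumptions."""

    def __init__(self, num_classes: int, img_height: int = 32, hidden: int = 128):
        super().__init__()
        # Two conv blocks; each max-pool halves both height and width.
        self.cnn = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        feat_height = img_height // 4
        self.rnn = nn.LSTM(64 * feat_height, hidden,
                           bidirectional=True, batch_first=True)
        # num_classes counts the CTC blank (assumed to be index 0).
        self.head = nn.Linear(2 * hidden, num_classes)

    def forward(self, images: torch.Tensor) -> torch.Tensor:
        # images: (batch, 1, height, width), grayscale text lines
        feats = self.cnn(images)                               # (B, C, H', W')
        b, c, h, w = feats.shape
        # Treat each image column as one time step.
        seq = feats.permute(0, 3, 1, 2).reshape(b, w, c * h)   # (B, W', C*H')
        out, _ = self.rnn(seq)                                 # (B, W', 2*hidden)
        logits = self.head(out)                                # (B, W', classes)
        # nn.CTCLoss expects log-probabilities shaped (T, B, classes).
        return logits.log_softmax(dim=-1).permute(1, 0, 2)


model = TinyCRNN(num_classes=80)
images = torch.randn(2, 1, 32, 128)          # random stand-in for line images
log_probs = model(images)                    # (32, 2, 80): 128 / 4 = 32 steps

# Random stand-in labels; class 0 is reserved for the blank.
targets = torch.randint(1, 80, (2, 10))
input_lengths = torch.full((2,), log_probs.size(0), dtype=torch.long)
target_lengths = torch.full((2,), 10, dtype=torch.long)
ctc = nn.CTCLoss(blank=0, zero_infinity=True)
loss = ctc(log_probs, targets, input_lengths, target_lengths)
```

The point of the sketch is the shape bookkeeping: the CNN output is reinterpreted as a left-to-right sequence of column features, which is what lets CTC align frames to characters without per-character segmentation.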

What I built

A PyTorch training pipeline for line-level recognition on the IAM handwriting dataset, evaluated with Character Error Rate (CER) and Word Error Rate (WER), plus a small inference web UI under OCR_WebApp/ for inspecting predictions visually.
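For reference, both metrics reduce to an edit distance normalised by the reference length. The sketch below is a plain-Python illustration under that standard definition; the function names are hypothetical and it is not the repo's evaluation harness. It scores one line at a time, whereas a corpus-level figure would typically sum edits and reference lengths over all lines before dividing.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """Character-level edits divided by reference length in characters."""
    return edit_distance(ref, hyp) / max(len(ref), 1)


def wer(ref: str, hyp: str) -> float:
    """Word-level edits divided by reference length in words."""
    ref_words = ref.split()
    return edit_distance(ref_words, hyp.split()) / max(len(ref_words), 1)


reference = "the quick brown fox"
prediction = "the quick brwn fox"
line_cer = cer(reference, prediction)   # 1 edit over 19 characters
line_wer = wer(reference, prediction)   # 1 wrong word out of 4
```

A single dropped character costs one character edit but a whole word error, which is why WER is usually much higher than CER on the same predictions.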

Technical components

Framework: PyTorch
Architecture: CNN feature extractor → BiLSTM → CTC head
Data: IAM handwriting database, line-level
Metrics: Character Error Rate, Word Error Rate (held-out splits)
Decoding: Greedy + beam CTC decoding
Inference UI: Web app under OCR_WebApp/ for visual prediction inspection
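Of the two decoding modes listed, greedy CTC decoding is the simpler: take the best class per frame, collapse consecutive repeats, then drop blanks. The sketch below is a plain-Python illustration of that rule, not the repo's decoder; the function name, the toy three-symbol alphabet, and the blank at index 0 are assumptions.

```python
def ctc_greedy_decode(frame_scores, alphabet, blank=0):
    """Greedy CTC decoding.

    frame_scores: list of per-frame score lists, one score per class.
    alphabet: symbol for each class index; the entry at `blank` is unused.
    """
    # Best class per frame (argmax over the class scores).
    best_path = [max(range(len(f)), key=f.__getitem__) for f in frame_scores]

    decoded = []
    prev = None
    for idx in best_path:
        # Emit only when the class changes and is not the blank.
        if idx != prev and idx != blank:
            decoded.append(alphabet[idx])
        prev = idx
    return "".join(decoded)


alphabet = ["-", "a", "b"]          # index 0 is the blank placeholder
scores = [
    [0.1, 0.8, 0.1],                # a
    [0.2, 0.7, 0.1],                # a (repeat, collapsed)
    [0.9, 0.05, 0.05],              # blank (separates the two a's)
    [0.1, 0.8, 0.1],                # a
    [0.1, 0.2, 0.7],                # b
    [0.1, 0.1, 0.8],                # b (repeat, collapsed)
]
text = ctc_greedy_decode(scores, alphabet)   # "aab"
```

The blank between the two runs of `a` is what allows a genuine double letter to survive the collapse step. Beam decoding instead keeps several candidate prefixes per frame and sums probability over the alignments that map to each one.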

Evidence / outputs

TODO: publish concrete CER / WER numbers on the IAM test split, a comparison row against a CRNN baseline, model size, and average per-line inference latency. The current README ships the architecture and evaluation harness but not a results table — this page intentionally does not invent numbers.

Current status

Experimental. The architecture and training loop are stable; the remaining open item is publishing a results table in the repo.

Limitations

Repo

github.com/GioiaZheng/handwritten-ocr-system