tesseract: Open Source OCR Engine

Bindings to 'Tesseract': a powerful optical character recognition (OCR) engine that supports over 100 languages. The engine is highly configurable, so the detection algorithms can be tuned to obtain the best possible results.

Version: 5.2.2
Imports: Rcpp (≥ 0.12.12), pdftools (≥ 1.5), curl, rappdirs, digest
LinkingTo: Rcpp
Suggests: magick (≥ 1.7), spelling, knitr, tibble, rmarkdown
Published: 2024-10-04
DOI: 10.32614/CRAN.package.tesseract
Author: Jeroen Ooms [aut, cre]
Maintainer: Jeroen Ooms <jeroenooms at gmail.com>
BugReports: https://github.com/ropensci/tesseract/issues
License: Apache License 2.0
URL: https://docs.ropensci.org/tesseract/, https://ropensci.r-universe.dev/tesseract
NeedsCompilation: yes
SystemRequirements: Tesseract >= 3.03 (libtesseract-dev / tesseract-devel) and Leptonica (libleptonica-dev / leptonica-devel). On Debian you need to install the English training data separately (tesseract-ocr-eng)
Language: en-US
Materials: NEWS
In views: NaturalLanguageProcessing
CRAN checks: tesseract results

Documentation:

Reference manual: tesseract.pdf
Vignettes: Using the Tesseract OCR engine in R (source, R code)

Downloads:

Package source: tesseract_5.2.2.tar.gz
Windows binaries: r-devel: tesseract_5.2.1.zip, r-release: tesseract_5.2.2.zip, r-oldrel: tesseract_5.2.2.zip
macOS binaries: r-release (arm64): tesseract_5.2.2.tgz, r-oldrel (arm64): tesseract_5.2.2.tgz, r-release (x86_64): tesseract_5.2.2.tgz, r-oldrel (x86_64): tesseract_5.2.2.tgz
Old sources: tesseract archive

Reverse dependencies:

Reverse suggests: camtrapR, imagerExtra, inlpubs, magick, pdftools, poldis

Linking:

Please use the canonical form https://CRAN.R-project.org/package=tesseract to link to this page.