5.2.1 - Fix shell script for cross compilation 5.2.0 - Windows: update to tesseract 5.3.2 5.1.0 - Win: update to tesseract 5.1.0. - Win: apply patch for freezes when running under UTF-8 in R-4.2. See: https://github.com/tesseract-ocr/tesseract/issues/3830 5.0.0 - Win/Mac: update to libtesseract 5.0.1 - Remove locale workaround on libtesseract 4.1+ (should only be needed for 4.0) - Remove cruft that was needed to support Solaris 4.2.0 - Prepare for API changes in upcoming Tesseract 5 release - Change the default language="eng" in tesseract() 4.1.2 - Fix for upstream master/main renames in language repos 4.1.1 - Win/Mac: update to libtesseract 4.1.1 4.1 - Fix memory leak in ocr_data() - Windows / MacOS: update to libtesseract 4.1.0. This re-enables the whitelist/blacklist options that were missing in Tesseract 4.0 4.0 - Windows, MacOS: Upgrade to upstream Tesseract 4.0! Completely new OCR engine. - Tesseract 4 has a new training data format. On Windows / MacOS you need to re-download your language data with tesseract_download(). The package uses separate directories for storing Tesseract 3 vs 4 data so they shouldn't get mixed up (hopefully). - Drop hard-dependency on tibble (only load if available) 2.3 - Fix problem with setlocale() not properly restoring locale. - Switch examples from dontrun{} to donttest{}, and '--run-donttest' on travis/appveyor 2.2 - Fixes for breaking changes in Tesseract 4.0.0 beta.3 - Set LC_ALL = C when initiating tesseract - Include to support Tesseract 4 2.1 - Fixes for 4.0.0-beta.1: they switched to semver + other data branch - Set LC_CTYPE to "C" when loading training data (required for some asian languages) - Add back OSD training data on Windows 2.0 - Set tesseract parameters at init so that all parameters types now actually work! - New function tesseract_params() lists all supported parameters and their default - Added 'config' argument to tesseract() which specifies a file with parameter values - Internally validate paremeter names before init to revent tesseract crashes - Rewrite the ocr_data() function in C++ to make it much faster - Tesseract 4 now gets data from the tessdata_fast repo as recommended upstream - Use default resolution of 300dpi when image does not contain resolution info 1.9 - Tesseract 4 now dowloads training data from the "tessdata_fast" repo - Add ocr_data() function that parses the hOCR output 1.8 - Add support for HOCR output (#20) - Remove 'script' and 'orientation' attributes in output (doesn't work in Tesseract 4) 1.7 (internal) - Add support upcoming Tesseract 4 (compiler fix + separate tessdata dir) - Configure script now explicitly tests for CXX11 (required by Tesseract 4) 1.6 - Windows: update libtesseract to 3.05.01 - tesseract_download now uses 3.04 tree (instead of 4.00) as suggested in readme - For static packags on Win/Mac, languages stored in: rappdirs::user_data_dir('tesseract') - Use 'png' instead of 'tiff' to read magick images - Compile with $(C_VISIBILITY) to hide internal symbols (requires Rcpp 0.12.12) - Use Rcpp symbol registration 1.4 - Run engine finalizer on R exit (requires Rcpp 0.12.10) - Move autobrew script to separate repository - Add symbol registration 1.3 - tesseract() gains an 'options' parameter for setting engine variables - New tessseract_download() function for installing training data on Win/Mac - Initiate default tesseract engine onAttach() to fail for missing training data - Add support for ocr() on magick images 1.2 - Try to fix build for CRAN OS-X, again. 1.1 - Try to fix build for CRAN OS-X build server - Show 'loaded' and 'available' languages in print.tesseract() 1.0 - Initial CRAN release