docker and ocr updates

2023-12-10 22:02:30 +00:00
parent 8b55ffff96
commit 59c7978330
28 changed files with 100 additions and 110 deletions
--- a/HowToUseOCR.md
+++ b/HowToUseOCR.md
@@ -2,6 +2,9 @@

 This document provides instructions on how to add additional language packs for the OCR tab in Stirling-PDF, both inside and outside of Docker.

+## My OCR used to work and now doesn't!
+Please update the Tesseract path in your Docker volume mapping from version 4.00 to 5.
+
 ## How does the OCR Work
 Stirling-PDF uses [OCRmyPDF](https://github.com/ocrmypdf/OCRmyPDF) which in turn uses tesseract for its text recognition.
 All credit goes to them for this awesome work! 
@@ -18,7 +21,7 @@ Depending on your requirements, you can choose the appropriate language pack for
 ### Installing Language Packs

 1. Download the desired language pack(s) by selecting the `.traineddata` file(s) for the language(s) you need.
-2. Place the `.traineddata` files in the Tesseract tessdata directory: `/usr/share/tesseract-ocr/4.00/tessdata` (Debian) or `/usr/share/tesseract/tessdata` (Fedora)
+2. Place the `.traineddata` files in the Tesseract tessdata directory: `/usr/share/tesseract-ocr/5/tessdata` (Debian) or `/usr/share/tesseract/tessdata` (Fedora)

 # DO NOT REMOVE EXISTING ENG.TRAINEDDATA, IT'S REQUIRED.

@@ -34,14 +37,14 @@ services:
  your_service_name:
    image: your_docker_image_name
    volumes:
-      - /location/of/trainingData:/usr/share/tesseract-ocr/4.00/tessdata
+      - /location/of/trainingData:/usr/share/tesseract-ocr/5/tessdata
 ```


 #### Docker run
 Add the following to your existing docker run command
 ```bash
-v /location/of/trainingData:/usr/share/tesseract-ocr/4.00/tessdata
+-v /location/of/trainingData:/usr/share/tesseract-ocr/5/tessdata
 ```

 #### Non-Docker