Removal of Ghostscript to use qpdf and tesseract directly (#2338)
* navbar fix multi tool and compress location
* release notes and ghostscript removal
* cleanups
* formatting
* update docs
* more
* more
* docs
* release bump
* Hardening suggestions for Stirling-PDF / ghostscript (#2339)
* Protect `readLine()` against DoS
* Sanitized user-provided file names in HTTP multipart uploads

---------

Co-authored-by: pixeebot[bot] <104101892+pixeebot[bot]@users.noreply.github.com>

---------

Co-authored-by: pixeebot[bot] <104101892+pixeebot[bot]@users.noreply.github.com>
@@ -8,7 +8,7 @@ The paths have changed for the tessdata locations on new Docker images. Please u
## How does the OCR Work
Stirling-PDF uses [OCRmyPDF](https://github.com/ocrmypdf/OCRmyPDF), which in turn uses Tesseract for its text recognition. All credit goes to them for this awesome work!
Stirling-PDF uses Tesseract for its text recognition. All credit goes to them for this awesome work!
## Language Packs
@@ -52,8 +52,6 @@ Add the following to your existing Docker run command:
### Non-Docker Setup
If you are not using Docker, you need to install the OCR components, including the `ocrmypdf` app. You can see the [OCRmyPDF install guide](https://ocrmypdf.readthedocs.io/en/latest/installation.html).
For Debian-based systems, install languages with this command:
```bash
@@ -83,8 +81,7 @@ rpm -qa | grep tesseract-langpack | sed 's/tesseract-langpack-//g'
For Windows:
Ensure ocrmypdf is installed with
``pip install ocrmypdf``
You must ensure tesseract is installed
Additional languages must be downloaded manually:
Download desired .traineddata files from tessdata or tessdata_fast
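
After copying `.traineddata` files by hand, it can help to confirm which language packs are actually present. The following is a minimal sketch, not part of Stirling-PDF: it assumes each language is a single `<code>.traineddata` file in one tessdata folder, and the function names and the example path are hypothetical.

```python
from pathlib import Path


def installed_languages(tessdata_dir):
    """Return sorted language codes, one per *.traineddata file in tessdata_dir."""
    # "eng.traineddata" -> "eng"; files with other extensions are ignored.
    return sorted(p.stem for p in Path(tessdata_dir).glob("*.traineddata"))


def missing_languages(tessdata_dir, wanted):
    """Return the codes from `wanted` that have no .traineddata file yet."""
    have = set(installed_languages(tessdata_dir))
    return [lang for lang in wanted if lang not in have]


if __name__ == "__main__":
    # Hypothetical location; substitute the tessdata folder of your install.
    tessdata = Path("tessdata")
    if tessdata.is_dir():
        print("installed:", installed_languages(tessdata))
        print("missing:", missing_languages(tessdata, ["eng", "deu"]))
```

If Tesseract is on the `PATH`, `tesseract --list-langs` reports the same information from Tesseract's own point of view.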