Removal of Ghostscript to use qpdf and tesseract directly (#2338)

* navbar fix multi tool and compress location
* release notes and ghostscript removal
* cleanups
* formatting
* update docs
* more
* more
* docs
* release bump
* Hardening suggestions for Stirling-PDF / ghostscript (#2339)
* Protect `readLine()` against DoS
* Sanitized user-provided file names in HTTP multipart uploads

---------

Co-authored-by: pixeebot[bot] <104101892+pixeebot[bot]@users.noreply.github.com>

---------

Co-authored-by: pixeebot[bot] <104101892+pixeebot[bot]@users.noreply.github.com>
2024-11-26 20:50:35 +00:00
parent 654bc94d44
commit 833b3c45c6
69 changed files with 1106 additions and 665 deletions
--- a/LocalRunGuide.md
+++ b/LocalRunGuide.md
@@ -68,7 +68,7 @@ nix-env -iA nixpkgs.jbig2enc

 ### Step 3: Install Additional Software

-Next we need to install LibreOffice for conversions, ocrmypdf for OCR, and OpenCV for pattern recognition functionality.
+Next we need to install LibreOffice for conversions, qpdf for OCR, and OpenCV for pattern recognition functionality.

 Install the following software:

@@ -81,27 +81,27 @@ Install the following software:
 - unoconv
 - pngquant
 - unpaper
- ocrmypdf
+- qpdf
 - opencv-python-headless

 For Debian-based systems, you can use the following command:

 ```bash
-sudo apt-get install -y libreoffice-writer libreoffice-calc libreoffice-impress unpaper ocrmypdf
+sudo apt-get install -y libreoffice-writer libreoffice-calc libreoffice-impress unpaper qpdf
 pip3 install uno opencv-python-headless unoconv pngquant WeasyPrint --break-system-packages
 ```

 For Fedora:

 ```bash
-sudo dnf install -y libreoffice-writer libreoffice-calc libreoffice-impress unpaper ocrmypdf
+sudo dnf install -y libreoffice-writer libreoffice-calc libreoffice-impress unpaper qpdf
 pip3 install uno opencv-python-headless unoconv pngquant WeasyPrint
 ```

 For Nix:

 ```bash
-nix-env -iA nixpkgs.unpaper nixpkgs.libreoffice nixpkgs.ocrmypdf nixpkgs.poppler_utils
+nix-env -iA nixpkgs.unpaper nixpkgs.libreoffice nixpkgs.qpdf nixpkgs.poppler_utils
 pip3 install uno opencv-python-headless unoconv pngquant WeasyPrint
 ```

@@ -146,7 +146,6 @@ The easiest method is to use the language packs provided by your repositories. S

 1. Download the desired language pack(s) by selecting the `.traineddata` file(s) for the language(s) you need.
 2. Place the `.traineddata` files in the Tesseract tessdata directory: `/usr/share/tessdata`
-3. Please view [OCRmyPDF install guide](https://ocrmypdf.readthedocs.io/en/latest/installation.html) for more info.

 **IMPORTANT:** DO NOT REMOVE EXISTING `eng.traineddata`, IT'S REQUIRED.
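
After following the updated install commands in the guide, it can help to confirm that the tools are actually on the `PATH`. The snippet below is a minimal sketch and not part of this commit: `check_tool` is a hypothetical helper, and the command names in the loop (`qpdf`, `tesseract`, `unpaper`, `soffice`) are assumptions about what the packages install on your distribution.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of Stirling-PDF): report whether a command
# can be found on PATH. Returns non-zero when the command is missing.
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "found: $1"
  else
    echo "missing: $1"
    return 1
  fi
}

# Assumed command names; adjust them if your distribution uses different ones.
for tool in qpdf tesseract unpaper soffice; do
  check_tool "$tool" || true   # keep going so every tool gets reported
done
```

Any line starting with `missing:` points at a package that still needs to be installed, or at a binary that goes by a different name on your system.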