Fixes and others (#83)

Features
-------------
Custom application name via APP_NAME docker env
(These next 3 are done with OCRMyPDF)
Extra features to OCR for scanned page cleanup (tilt/noise fixing)
Adding OCR ability to read and output to text file
Added Dedicated PDF/A conversion page 

Bug fixes
--------------
Fix concurrent calls on Libre and OCRMyPDF
jbig fix for compressions
Fix for compression metadata issues due to forced conversions to PDF/A


Other
--------
Removal of UK US language and just using "English" due to extra development time
Still issue with concurrent files for PDF to image... will fix later sorry
This commit is contained in:
Anthony Stirling
2023-04-01 21:02:54 +01:00
committed by GitHub
parent 0b4e3de455
commit 6d5dbd9729
23 changed files with 531 additions and 537 deletions

View File

@@ -76,12 +76,15 @@ home.changeMetadata.desc=Change/Remove/Add metadata from a PDF document
home.fileToPDF.title=Convert file to PDF
home.fileToPDF.desc=Convert nearly any file to PDF (DOCX, PNG, XLS, PPT, TXT and more)
home.ocr.title=Run OCR on PDF
home.ocr.desc=Scans and detects text from images within a PDF and re-adds it as text.
home.ocr.title=Run OCR on PDF and/or Cleanup scans
home.ocr.desc=Cleanup scans and detects text from images within a PDF and re-adds it as text.
home.extractImages.title=Extract Images
home.extractImages.desc=Extracts all images from a PDF and saves them to zip
home.pdfToPDFA.title=Convert PDF to PDF/A
home.pdfToPDFA.desc=Convert PDF to PDF/A for long-term storage
navbar.settings=Settings
settings.title=Settings
@@ -93,15 +96,28 @@ settings.downloadOption.2=Open in new window
settings.downloadOption.3=Download file
settings.zipThreshold=Zip files when the number of downloaded files exceeds
#OCR
ocr.title=OCR
ocr.header=OCR (Optical Character Recognition)
ocr.title=OCR / Scan Cleanup
ocr.header=Cleanup Scans / OCR (Optical Character Recognition)
ocr.selectText.1=Select languages that are to be detected within the PDF (Ones listed are the ones currently detected):
ocr.selectText.2=Produce text file containing OCR text alongside the OCR'ed PDF
ocr.selectText.3=Correct pages were scanned at a skewed angle by rotating them back into place
ocr.selectText.4=Clean page so its less likely that OCR will find text in background noise. (No output change)
ocr.selectText.5=Clean page so its less likely that OCR will find text in background noise, maintains cleanup in output.
ocr.selectText.6=Ignores pages that have interacive text on them, only OCRs pages that are images
ocr.selectText.7=Force OCR, will OCR Every page removing all original text elements
ocr.selectText.8=Normal (Will error if PDF contains text)
ocr.selectText.9=Additional Settings
ocr.selectText.10=OCR Mode
ocr.help=Please read this documentation on how to use this for other languages and/or use not in docker
ocr.credit=This service uses OCRmyPDF and Tesseract for OCR.
ocr.submit=Process PDF with OCR
extractImages.title=Extract Images
extractImages.header=Extract Images
extractImages.selectText=Select image format to convert extracted images to
@@ -286,6 +302,10 @@ xlsToPdf.convert=convert
pdfToPDFA.title=PDF To PDF/A
pdfToPDFA.header=PDF To PDF/A
pdfToPDFA.credit=This service uses OCRmyPDF for PDF/A conversion
pdfToPDFA.submit=Convert