25 Commits

Author SHA1 Message Date
Ludy
9152e64b9f Remove convertBookTypeToPdf and Improve File Sanitization in FileToPdf (#3072)
# Description of Changes

Please provide a summary of the changes, including:

- **Removed `convertBookTypeToPdf` method**:
  - This method used `ebook-convert` from Calibre, which required external
    dependencies.
  - Its removal eliminates unnecessary process execution and simplifies
    the codebase.

- **Enhanced `sanitizeZipFilename` function**:
  - Added handling for drive letters (e.g., `C:\`).
  - Ensured all slashes are normalized to forward slashes.
  - Improved recursive path traversal removal to prevent directory escape
    vulnerabilities.

- **Refactored `ProcessExecutor` output handling**:
  - Replaced redundant `.size() > 0` checks with `.isEmpty()`.
  
- **Expanded unit tests in `FileToPdfTest`**:
  - Added tests for `sanitizeZipFilename` to cover edge cases.
  - Improved test descriptions and added assertion messages.
  - Added debug print statements for easier test debugging.
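
The three sanitization steps listed above (drive letters, slash normalization, repeated traversal removal) can be sketched roughly as follows. This is a minimal illustration and not the project's `sanitizeZipFilename`: the class name `ZipNameSanitizer`, the method name `sanitize`, and the order of the steps are assumptions made for the example.

```java
// Hypothetical helper; names and step order are assumptions, not the project's code.
final class ZipNameSanitizer {
    private ZipNameSanitizer() {}

    static String sanitize(String entryName) {
        if (entryName == null || entryName.isBlank()) {
            return "";
        }
        // Normalize every backslash to a forward slash.
        String name = entryName.replace('\\', '/');
        // Remove "../" repeatedly, so that input such as "....//" cannot
        // collapse back into "../" after a single pass.
        while (name.contains("../")) {
            name = name.replace("../", "");
        }
        // Drop a leading drive letter such as "C:/".
        name = name.replaceFirst("^[a-zA-Z]:/*", "");
        // Strip leading slashes so the entry is never an absolute path.
        name = name.replaceFirst("^/+", "");
        return name;
    }
}
```

With this sketch, `ZipNameSanitizer.sanitize("C:\\temp\\..\\report.html")` yields `temp/report.html`. String cleanup of this kind is usually paired with a check that the resolved output path still lies inside the extraction directory.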

---

## Checklist

### General

- [x] I have read the [Contribution
Guidelines](https://github.com/Stirling-Tools/Stirling-PDF/blob/main/CONTRIBUTING.md)
- [x] I have read the [Stirling-PDF Developer
Guide](https://github.com/Stirling-Tools/Stirling-PDF/blob/main/DeveloperGuide.md)
(if applicable)
- [ ] I have read the [How to add new languages to
Stirling-PDF](https://github.com/Stirling-Tools/Stirling-PDF/blob/main/HowToAddNewLanguage.md)
(if applicable)
- [x] I have performed a self-review of my own code
- [x] My changes generate no new warnings

### Documentation

- [ ] I have updated relevant docs on [Stirling-PDF's doc
repo](https://github.com/Stirling-Tools/Stirling-Tools.github.io/blob/main/docs/)
(if functionality has heavily changed)
- [ ] I have read the section [Add New Translation
Tags](https://github.com/Stirling-Tools/Stirling-PDF/blob/main/HowToAddNewLanguage.md#add-new-translation-tags)
(for new translation tags only)

### UI Changes (if applicable)

- [ ] Screenshots or videos demonstrating the UI changes are attached
(e.g., as comments or direct attachments in the PR)

### Testing (if applicable)

- [ ] I have tested my changes locally. Refer to the [Testing
Guide](https://github.com/Stirling-Tools/Stirling-PDF/blob/main/DeveloperGuide.md#6-testing)
for more details.
2025-02-26 19:25:35 +00:00
Anthony Stirling
f5ca02df1d Dynamic paths for tools and removal of unused book endpoints (#3018)
# Description of Changes

This pull request includes several changes primarily focused on
improving configuration management, removing deprecated methods, and
updating paths for external dependencies. The most important changes are
summarized below:

### Configuration Management Improvements:
* Added a new `RuntimePathConfig` class to manage dynamic paths for
operations and pipeline configurations
(`src/main/java/stirling/software/SPDF/config/RuntimePathConfig.java`).
* Removed the `bookAndHtmlFormatsInstalled` bean and its associated
logic from `AppConfig` and `EndpointConfiguration`
(`src/main/java/stirling/software/SPDF/config/AppConfig.java`,
`src/main/java/stirling/software/SPDF/config/EndpointConfiguration.java`).

### External Dependency Path Updates:
* Updated paths for `weasyprint` and `unoconvert` in
`ExternalAppDepConfig` to use values from `RuntimePathConfig`
(`src/main/java/stirling/software/SPDF/config/ExternalAppDepConfig.java`).

### Minor Adjustments:
* Corrected a typo from "Unoconv" to "Unoconvert" in
`EndpointConfiguration`
(`src/main/java/stirling/software/SPDF/config/EndpointConfiguration.java`).

2025-02-23 13:36:21 +00:00
Abdur Rahman
507d21772d Fix issue #2842: Handle qpdf exit code 3 as success with warnings (#2883)
# Description of Changes

Please provide a summary of the changes, including:

- **What was changed**:
  - Modified the `ProcessExecutor` class to accept exit code `3` from
    **qpdf** as a success with warnings.
  - Added a check to ensure that only **qpdf**’s exit code `3` is treated
    as a warning.
  - Added a warning log for **qpdf** exit code `3` to provide better
    visibility into the repair process.

- **Why the change was made**:
  - The repair process was failing when **qpdf** returned exit code `3`,
    even though the operation succeeded with warnings. This caused
    unnecessary errors for users.
  - The changes ensure that PDFs with minor structural issues (e.g.,
    mismatched object counts) are still repaired successfully, while logging
    warnings for transparency.
  - Added a check to ensure that only **qpdf**’s exit code `3` is treated
    as a warning, preventing potential issues with other tools that might
    use exit code `3` for actual errors.
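
The exit-code rule described above could look roughly like the sketch below. It is an illustration only and not the actual `ProcessExecutor` code: the `ExitCodePolicy` class, the `Tool` enum, and the `isSuccess` method are hypothetical names introduced for the example.

```java
import java.util.logging.Logger;

// Hypothetical policy class; not part of the project's ProcessExecutor.
final class ExitCodePolicy {
    private static final Logger LOG = Logger.getLogger(ExitCodePolicy.class.getName());

    // Hypothetical tool identifier; OTHER stands in for any non-qpdf tool.
    enum Tool { QPDF, OTHER }

    private ExitCodePolicy() {}

    static boolean isSuccess(Tool tool, int exitCode) {
        if (exitCode == 0) {
            return true;
        }
        // Only qpdf's exit code 3 means "succeeded with warnings".
        if (tool == Tool.QPDF && exitCode == 3) {
            LOG.warning("qpdf exited with code 3: operation succeeded with warnings");
            return true;
        }
        // Every other non-zero code, including 3 from other tools, is a failure.
        return false;
    }
}
```

Keying the exception on the tool as well as the code keeps exit code `3` an error for every other executable.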

Closes #2842

---

## Checklist

### General

- [x] I have read the [Contribution
Guidelines](https://github.com/Stirling-Tools/Stirling-PDF/blob/main/CONTRIBUTING.md)
- [x] I have read the [Stirling-PDF Developer
Guide](https://github.com/Stirling-Tools/Stirling-PDF/blob/main/DeveloperGuide.md)
(if applicable)
- [x] I have performed a self-review of my own code
- [x] My changes generate no new warnings

### Testing (if applicable)

- [x] I have tested my changes locally.
  - Verified that exit code `3` is only treated as a warning for **qpdf**
    and not for other tools.

---

### Additional Notes
- The changes align with **qpdf**'s behavior, where exit code `3`
indicates a successful operation with warnings.
- Added a check to ensure that only **qpdf**’s exit code `3` is treated
as a warning, preventing potential issues with other tools.

Co-authored-by: Anthony Stirling <77850077+Frooodle@users.noreply.github.com>
2025-02-04 21:01:41 +00:00
Anthony Stirling
9884c65b10 formatting and autowired constructors (#2557)
# Description
This pull request includes several changes aimed at improving the code
structure and removing redundant code. The most significant changes
involve reordering methods, removing unnecessary annotations, and
refactoring constructors to use dependency injection.
Dependencies are now injected through the constructor, which also doesn't
need the `@Autowired` annotation because it's applied by default for configuration.
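
As a rough illustration of the constructor-injection style described above, the sketch below uses plain Java so it compiles without Spring on the classpath. `PathProvider` and `ToolConfig` are hypothetical names and do not come from this commit.

```java
import java.util.Objects;

// Hypothetical collaborator; stands in for any bean a configuration class needs.
interface PathProvider {
    String toolPath(String tool);
}

// Hypothetical configuration class written in constructor-injection style.
final class ToolConfig {
    // final: assigned exactly once, in the constructor.
    private final PathProvider paths;

    // When a Spring-managed class has a single constructor, Spring uses it
    // for injection without an explicit @Autowired annotation.
    ToolConfig(PathProvider paths) {
        this.paths = Objects.requireNonNull(paths, "paths");
    }

    String qpdfPath() {
        return paths.toolPath("qpdf");
    }
}
```

The same constructor also serves plain unit tests, for example `new ToolConfig(tool -> "/usr/bin/" + tool)`.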



## Checklist

- [ ] I have read the [Contribution
Guidelines](https://github.com/Stirling-Tools/Stirling-PDF/blob/main/CONTRIBUTING.md)
- [ ] I have performed a self-review of my own code
- [ ] I have attached images of the change if it is UI based
- [ ] I have commented my code, particularly in hard-to-understand areas
- [ ] If my code has heavily changed functionality I have updated
relevant docs on [Stirling-PDFs doc
repo](https://github.com/Stirling-Tools/Stirling-Tools.github.io/blob/main/docs/)
- [ ] My changes generate no new warnings
- [ ] I have read the section [Add New Translation
Tags](https://github.com/Stirling-Tools/Stirling-PDF/blob/main/HowToAddNewLanguage.md#add-new-translation-tags)
(for new translation tags only)
2024-12-24 09:52:53 +00:00
Ludy87
af100d4190 Remove Direct Logger and Use Lombok @Slf4j 2024-12-17 10:26:18 +01:00
Anthony Stirling
833b3c45c6 Removal of Ghostscript to use qpdf and tesseract directly (#2338)
* navbar fix multi tool and compress location

* release notes and ghostscript removal

* cleanups

* formatting

* update docs

* more

* more

* docs

* release bump

* Hardening suggestions for Stirling-PDF / ghostscript (#2339)

* Protect `readLine()` against DoS

* Sanitized user-provided file names in HTTP multipart uploads

---------

Co-authored-by: pixeebot[bot] <104101892+pixeebot[bot]@users.noreply.github.com>

---------

Co-authored-by: pixeebot[bot] <104101892+pixeebot[bot]@users.noreply.github.com>
2024-11-26 20:50:35 +00:00
Rafael Encinas
7eea7fb3cb [Feature] Set Executor Instances limits dynamically from properties (#2193)
* Update 'ProcessExecutor.java' to use dynamic process limits from properties

* Move limits location out of 'application.properties'

* Rename 'SemaphoreLimit' to 'SessionLimit' and bundle with 'Timeout...' into one parent class
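
A limit read from properties and enforced with a semaphore might be sketched as below. This is a simplified illustration, not the project's `ProcessExecutor`: the `ProcessLimiter` class, its methods, and the property keys passed to it are all hypothetical.

```java
import java.util.Properties;
import java.util.concurrent.Callable;
import java.util.concurrent.Semaphore;

// Hypothetical limiter; caps how many tasks of one kind run at the same time.
final class ProcessLimiter {
    private final Semaphore permits;

    // Reads the limit from props under the given (hypothetical) key and falls
    // back to defaultLimit when the value is missing, malformed, or not positive.
    ProcessLimiter(Properties props, String key, int defaultLimit) {
        int limit = defaultLimit;
        String raw = props.getProperty(key);
        if (raw != null) {
            try {
                int parsed = Integer.parseInt(raw.trim());
                if (parsed > 0) {
                    limit = parsed;
                }
            } catch (NumberFormatException ignored) {
                // Keep the default when the property is not a number.
            }
        }
        this.permits = new Semaphore(limit);
    }

    int availableSlots() {
        return permits.availablePermits();
    }

    // Blocks until a slot is free, runs the task, and always releases the slot.
    <T> T run(Callable<T> task) throws Exception {
        permits.acquire();
        try {
            return task.call();
        } finally {
            permits.release();
        }
    }
}
```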
2024-11-07 00:43:57 +00:00
Anthony Stirling
eb526a5d0c logging and try catch 2024-06-02 11:59:43 +01:00
Anthony Stirling
f4fcede771 Update ProcessExecutor.java 2024-05-05 20:45:52 +01:00
Eric
dfb8c64f5a fix: switch to pdftohtml for pdf to html conversions (#998)
* fix: switch to pdftohtml for pdf to html conversions

* build: include poppler-utils in dockerfile for pdftohtml
2024-03-29 17:02:33 -04:00
Anthony Stirling
08e43cc89c fix #986 and #989 2024-03-28 17:09:21 +00:00
Anthony Stirling
ae73595335 Number of fixes and making pipeline LIVE! (#907)
Closes #889 and #332
#710
#901
#885
2024-03-13 19:15:10 +00:00
sbplat
4af58118c9 fix: use the same margins for x and y in the stamp feature 2024-02-07 21:40:33 -05:00
pixeebot[bot]
450e090252 Protect readLine() against DoS 2024-02-01 23:01:04 +00:00
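
The change behind this commit is not shown on this page. As a general illustration of the idea, an unbounded `readLine()` can be replaced by a reader that refuses lines longer than a fixed limit. The `BoundedLines` class below is a hypothetical sketch of that technique, simplified to handle `\n` and `\r\n` terminators only.

```java
import java.io.IOException;
import java.io.Reader;

// Hypothetical helper; a simplified stand-in for a bounded line reader.
final class BoundedLines {
    private BoundedLines() {}

    // Returns the next line without its terminator, or null at end of input.
    // Throws IOException once more than maxChars characters have been read
    // before a '\n' is seen (a trailing '\r' counts toward the limit).
    static String readLine(Reader reader, int maxChars) throws IOException {
        StringBuilder line = new StringBuilder();
        boolean readAny = false;
        int c;
        while ((c = reader.read()) != -1) {
            readAny = true;
            if (c == '\n') {
                break;
            }
            if (line.length() >= maxChars) {
                throw new IOException("Line exceeds " + maxChars + " characters");
            }
            line.append((char) c);
        }
        if (!readAny) {
            return null;
        }
        // Drop the '\r' of a "\r\n" terminator.
        int len = line.length();
        if (len > 0 && line.charAt(len - 1) == '\r') {
            line.setLength(len - 1);
        }
        return line.toString();
    }
}
```

Passing a `BufferedReader` as the `reader` argument keeps the character-by-character reads cheap.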
Anthony Stirling
e717d83f75 fixes and timeouts 2024-01-10 00:33:07 +00:00
Anthony Stirling
ef12c2f892 Add ebook support 2024-01-09 22:39:21 +00:00
Anthony Stirling
5f771b7851 formatting 2023-12-30 19:11:27 +00:00
Anthony Stirling
b5b4636e56 changes to script executor and init 2023-07-29 13:53:30 +01:00
Anthony Stirling
4367ae7934 html and url to pdf init 2023-07-22 16:57:40 +01:00
Anthony Stirling
5bee714437 utf8 bug fix and scan pages (#113) 2023-05-01 21:57:48 +01:00
Anthony Stirling
78d3fd3768 format and move everything, other in own folder 2023-04-22 12:51:01 +01:00
Anthony Stirling
c311f9a4ed Convert PDF to Docx, powerpoint and others (#90) 2023-04-16 22:03:30 +01:00
Anthony Stirling
6d5dbd9729 Fixes and others (#83)
Features
-------------
Custom application name via APP_NAME docker env
(These next 3 are done with OCRMyPDF)
Extra features to OCR for scanned page cleanup (tilt/noise fixing)
Adding OCR ability to read and output to text file
Added Dedicated PDF/A conversion page 

Bug fixes
--------------
Fix concurrent calls on Libre and OCRMyPDF
jbig fix for compressions
Fix for compression metadata issues due to forced conversions to PDF/A


Other
--------
Removal of separate UK and US language options in favor of a single "English", due to the extra development time
There is still an issue with concurrent files for PDF to image... will fix later, sorry
2023-04-01 21:02:54 +01:00
Anthony Stirling
a2a27e2216 Image stuff (#77)
Features
---------
Image to PDF supports multiple images, stretching and auto rotation
File inputs now only search for wanted file type
Settings now has a zip range so it can zip if you have more than x downloads (default 4)

extras
---------
DevTools support for easier development
Fix for temporary files for thread safety
2023-03-25 22:16:26 +00:00
Anthony Stirling
a9145fe84c Lots of changes (#70)
Image extraction and conversion to formats 

Multi parallel file execution for all forms so you can input multiple files quickly 

Convert any file at all to PDF using LibreOffice, super powerful
Sadly makes docker image larger but worth it

OCR PDF using OCRMyPDF
Works awesomely for adding text to an image

Improved compression using the OCRMyPDF app

Settings page with custom download options such as 
- open in same window
- open in new window
- download
- download as zip

Update detection in the settings page: it should show a notification if there is an update (very hidden)

UI cleanups

Add other image formats to PDF to Image

Various fixes to icons and pdf.js usage
2023-03-20 21:55:11 +00:00