Merge branch 'main' into main
commit
b1f8324c21
1 changed files with 54 additions and 9 deletions
|
@ -43,14 +43,20 @@ sudo apt-get update
|
||||||
sudo apt-get install -y git automake autoconf libtool libleptonica-dev pkg-config zlib1g-dev make g++ java-17-openjdk python3 python3-pip
|
sudo apt-get install -y git automake autoconf libtool libleptonica-dev pkg-config zlib1g-dev make g++ java-17-openjdk python3 python3-pip
|
||||||
```
|
```
|
||||||
|
|
||||||
|
For Fedora-based systems use this command:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
sudo dnf install -y git automake autoconf libtool leptonica-devel pkg-config zlib-devel make gcc-c++ java-17-openjdk python3 python3-pip
|
||||||
|
```
|
||||||
|
|
||||||
### Step 2: Clone and Build jbig2enc (Only required for certain OCR functionality)
|
### Step 2: Clone and Build jbig2enc (Only required for certain OCR functionality)
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
git clone https:github.com/agl/jbig2enc
|
git clone https://github.com/agl/jbig2enc.git &&\
|
||||||
cd jbig2enc
|
cd jbig2enc &&\
|
||||||
./autogen.sh
|
./autogen.sh &&\
|
||||||
./configure
|
./configure &&\
|
||||||
make
|
make &&\
|
||||||
sudo make install
|
sudo make install
|
||||||
```
|
```
|
||||||
|
|
||||||
|
@ -88,11 +94,19 @@ sudo apt-get install -y libreoffice-core libreoffice-common libreoffice-writer l
|
||||||
pip3 install opencv-python-headless
|
pip3 install opencv-python-headless
|
||||||
```
|
```
|
||||||
|
|
||||||
|
For Fedora:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
sudo dnf install -y libreoffice-writer libreoffice-calc libreoffice-impress unpaper ocrmypdf
|
||||||
|
pip3 install uno opencv-python-headless unoconv pngquant
|
||||||
|
```
|
||||||
|
|
||||||
### Step 4: Clone and Build Stirling-PDF
|
### Step 4: Clone and Build Stirling-PDF
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
git clone https://github.com/Frooodle/Stirling-PDF.git
|
git clone https://github.com/Frooodle/Stirling-PDF.git &&\
|
||||||
cd Stirling-PDF
|
cd Stirling-PDF &&\
|
||||||
|
chmod +x ./gradlew &&\
|
||||||
./gradlew build
|
./gradlew build
|
||||||
```
|
```
|
||||||
|
|
||||||
|
@ -104,18 +118,49 @@ You can move this file to a desired location, for example, `/opt/Stirling-PDF/`.
|
||||||
You must also move the Script folder within the Stirling-PDF repo that you have downloaded to this directory.
|
You must also move the Script folder within the Stirling-PDF repo that you have downloaded to this directory.
|
||||||
This folder is required for the python scripts using OpenCV
|
This folder is required for the python scripts using OpenCV
|
||||||
|
|
||||||
|
```bash
|
||||||
|
sudo mkdir -p /opt/Stirling-PDF &&\
|
||||||
|
sudo mv ./build/libs/S-PDF-*.jar /opt/Stirling-PDF/ &&\
|
||||||
|
sudo mv scripts /opt/Stirling-PDF/ &&\
|
||||||
|
echo "Scripts installed."
|
||||||
|
```
|
||||||
### Step 6: Other files
|
### Step 6: Other files
|
||||||
#### OCR
|
#### OCR
|
||||||
If you plan to use the OCR (Optical Character Recognition) functionality, you might need to install language packs for Tesseract if running none english scanning.
|
If you plan to use the OCR (Optical Character Recognition) functionality, you might need to install language packs for Tesseract when scanning non-English documents.
|
||||||
|
|
||||||
##### Installing Language Packs
|
##### Installing Language Packs
|
||||||
|
|
||||||
1. Download the desired language pack(s) by selecting the `.traineddata` file(s) for the language(s) you need.
|
1. Download the desired language pack(s) by selecting the `.traineddata` file(s) for the language(s) you need. You can also use the language packs provided by your distribution's repositories.
|
||||||
2. Place the `.traineddata` files in the Tesseract tessdata directory: `/usr/share/tesseract-ocr/4.00/tessdata`
|
2. Place the `.traineddata` files in the Tesseract tessdata directory: `/usr/share/tesseract-ocr/4.00/tessdata`
|
||||||
Please view [OCRmyPDF install guide](https://ocrmypdf.readthedocs.io/en/latest/installation.html) for more info.
|
Please view [OCRmyPDF install guide](https://ocrmypdf.readthedocs.io/en/latest/installation.html) for more info.
|
||||||
**IMPORTANT:** DO NOT REMOVE EXISTING `eng.traineddata`, IT'S REQUIRED.
|
**IMPORTANT:** DO NOT REMOVE EXISTING `eng.traineddata`, IT'S REQUIRED.
|
||||||
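The manual copy in step 2 can be sketched as a small shell helper. This is only an illustration, not part of the project: the function name `install_langpacks` and its two arguments are hypothetical, and it assumes the downloaded `.traineddata` files sit in one directory. It skips any file that already exists in the destination, so an existing `eng.traineddata` is never overwritten.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Stirling-PDF): copy .traineddata files
# from a download directory into a tessdata directory, leaving files that
# already exist in the destination untouched.
install_langpacks() {
  local src_dir="$1" dest_dir="$2"
  local f name
  for f in "$src_dir"/*.traineddata; do
    [ -e "$f" ] || continue            # glob matched nothing: skip
    name="$(basename "$f")"
    if [ -e "$dest_dir/$name" ]; then
      echo "skipping existing $name"   # e.g. eng.traineddata stays as-is
    else
      cp "$f" "$dest_dir/"
    fi
  done
}
```

A call might look like `install_langpacks ~/Downloads /usr/share/tesseract-ocr/4.00/tessdata`. Writing to that directory normally needs root privileges, so the function would have to run in a root shell.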
|
|
||||||
|
On Debian-based systems, install languages with these commands:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
sudo apt update
|
||||||
|
# All languages
|
||||||
|
# sudo apt install -y 'tesseract-ocr-*'
|
||||||
|
|
||||||
|
# Find languages:
|
||||||
|
apt search tesseract-ocr-
|
||||||
|
|
||||||
|
# View installed languages:
|
||||||
|
dpkg-query -W 'tesseract-ocr-*' | sed 's/tesseract-ocr-//g'
|
||||||
|
```
|
||||||
|
|
||||||
|
Fedora:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
# All languages
|
||||||
|
# sudo dnf install -y 'tesseract-langpack-*'
|
||||||
|
|
||||||
|
# Find languages:
|
||||||
|
dnf search -C tesseract-langpack-
|
||||||
|
|
||||||
|
# View installed languages:
|
||||||
|
rpm -qa | grep tesseract-langpack | sed 's/tesseract-langpack-//g'
|
||||||
|
```
|
||||||
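Whichever distribution is used, Tesseract itself can report which languages it sees. The sketch below is a hedged illustration: `has_lang` is a hypothetical helper name, and it assumes `tesseract --list-langs` prints one language code per line after a header line.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if the language code in $1 appears as a
# whole line in a `tesseract --list-langs` listing read from stdin.
has_lang() {
  grep -qx -- "$1"
}

# Example use (assumes tesseract is on PATH; 2>&1 because some versions
# print the listing to stderr):
#   tesseract --list-langs 2>&1 | has_lang deu && echo "deu is available"
```

Matching whole lines keeps a code such as `deu` from being confused with a longer name that merely contains it.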
|
|
||||||
### Step 7: Run Stirling-PDF
|
### Step 7: Run Stirling-PDF
|
||||||
|
|
||||||
|
|