Update LocalRunGuide.md

fixed one file that was not executable with chmod +x

also added language-pack installation and viewing.

Manually adding langpacks could also be useful but dont see the reason yet
This commit is contained in:
trytomakeyouprivate 2023-05-14 18:32:17 +00:00 committed by GitHub
parent d6cf4648a2
commit 42cc031200
No known key found for this signature in database
GPG key ID: 4AEE18F83AFDEB23

View file

@ -52,11 +52,11 @@ sudo dnf install -y git automake autoconf libtool leptonica-devel pkg-config zli
### Step 2: Clone and Build jbig2enc (Only required for certain OCR functionality)
```bash
git clone https:github.com/agl/jbig2enc
cd jbig2enc
./autogen.sh
./configure
make
git clone https://github.com/agl/jbig2enc.git &&\
cd jbig2enc &&\
./autogen.sh &&\
./configure &&\
make &&\
sudo make install
```
@ -97,15 +97,16 @@ pip3 install opencv-python-headless
For Fedora:
```bash
sudo dnf install -y libreoffice-writer libreoffice-calc libreoffice-impress unpaper ocrmypdf tesseract-osd
sudo dnf install -y libreoffice-writer libreoffice-calc libreoffice-impress unpaper ocrmypdf
pip3 install uno opencv-python-headless unoconv pngquant
```
### Step 4: Clone and Build Stirling-PDF
```bash
git clone https://github.com/Frooodle/Stirling-PDF.git
cd Stirling-PDF
git clone https://github.com/Frooodle/Stirling-PDF.git &&\
cd Stirling-PDF &&\
chmod +x ./gradlew &&\
./gradlew build
```
@ -117,18 +118,49 @@ You can move this file to a desired location, for example, `/opt/Stirling-PDF/`.
You must also move the Script folder within the Stirling-PDF repo that you have downloaded to this directory.
This folder is required for the python scripts using OpenCV
```bash
sudo mkdir /opt/Stirling-PDF &&\
sudo mv /build/libs/S-PDF-*.jar /opt/Stirling-PDF/ &&\
sudo mv scripts /opt/Stirling-PDF/ &&\
echo "Scripts installed."
```
### Step 6: Other files
#### OCR
If you plan to use the OCR (Optical Character Recognition) functionality, you might need to install language packs for Tesseract if running none english scanning.
If you plan to use the OCR (Optical Character Recognition) functionality, you might need to install language packs for Tesseract if running non-english scanning.
##### Installing Language Packs
1. Download the desired language pack(s) by selecting the `.traineddata` file(s) for the language(s) you need.
1. Download the desired language pack(s) by selecting the `.traineddata` file(s) for the language(s) you need. You can also use your repositories provided langpacks.
2. Place the `.traineddata` files in the Tesseract tessdata directory: `/usr/share/tesseract-ocr/4.00/tessdata`
Please view [OCRmyPDF install guide](https://ocrmypdf.readthedocs.io/en/latest/installation.html) for more info.
**IMPORTANT:** DO NOT REMOVE EXISTING `eng.traineddata`, IT'S REQUIRED.
Debian based systems, install languages with this command:
```bash
sudo apt update &&\
# All languages
# sudo apt install -y 'tesseract-ocr-*'
# Find languages:
apt search tesseract-ocr-
# View installed languages:
dpkg-query -W tesseract-ocr- | sed 's/tesseract-ocr-//g'
```
Fedora:
```bash
# All languages
# sudo dnf install -y tesseract-langpack-*
# Find languages:
dnf search -C tesseract-langpack-
# View installed languages:
rpm -qa | grep tesseract-langpack | sed 's/tesseract-langpack-//g'
```
### Step 7: Run Stirling-PDF