Merge branch 'main' into patch-4

This commit is contained in:
Anthony Stirling 2023-05-14 22:11:20 +01:00 committed by GitHub
commit 093dcba4ba
No known key found for this signature in database
GPG key ID: 4AEE18F83AFDEB23
2 changed files with 28 additions and 3 deletions

View file

@ -18,7 +18,7 @@ Depending on your requirements, you can choose the appropriate language pack for
### Installing Language Packs
1. Download the desired language pack(s) by selecting the `.traineddata` file(s) for the language(s) you need.
2. Place the `.traineddata` files in the Tesseract tessdata directory: `/usr/share/tesseract-ocr/4.00/tessdata`
2. Place the `.traineddata` files in the Tesseract tessdata directory: `/usr/share/tesseract-ocr/4.00/tessdata` (Debian) or `/usr/share/tesseract/tessdata` (Fedora)
# DO NOT REMOVE EXISTING ENG.TRAINEDDATA, ITS REQUIRED.
@ -48,4 +48,29 @@ Add the following to your existing docker run command
If you are not using Docker, you need to install the OCR components, including the ocrmypdf app.
You can see [OCRmyPDF install guide](https://ocrmypdf.readthedocs.io/en/latest/installation.html)
Debian based systems, install languages with this command:
```bash
sudo apt update &&\
# All languages
# sudo apt install -y 'tesseract-ocr-*'
# Find languages:
apt search tesseract-ocr-
# View installed languages:
dpkg-query -W tesseract-ocr- | sed 's/tesseract-ocr-//g'
```
Fedora:
```bash
# All languages
# sudo dnf install -y tesseract-langpack-*
# Find languages:
dnf search -C tesseract-langpack-
# View installed languages:
rpm -qa | grep tesseract-langpack | sed 's/tesseract-langpack-//g'
```

View file

@ -92,8 +92,8 @@ Install the following software:
For Debian-based systems, you can use the following command:
```bash
sudo apt-get install -y libreoffice-core libreoffice-common libreoffice-writer libreoffice-calc libreoffice-impress python3-uno unoconv pngquant unpaper ocrmypdf
pip3 install opencv-python-headless
sudo apt-get install -y libreoffice-writer libreoffice-calc libreoffice-impress unpaper ocrmypdf
pip3 install uno opencv-python-headless unoconv pngquant
```
For Fedora: