Merge branch 'main' into patch-4
commit
093dcba4ba
2 changed files with 28 additions and 3 deletions
|
@ -18,7 +18,7 @@ Depending on your requirements, you can choose the appropriate language pack for
|
||||||
### Installing Language Packs
|
### Installing Language Packs
|
||||||
|
|
||||||
1. Download the desired language pack(s) by selecting the `.traineddata` file(s) for the language(s) you need.
|
1. Download the desired language pack(s) by selecting the `.traineddata` file(s) for the language(s) you need.
|
||||||
2. Place the `.traineddata` files in the Tesseract tessdata directory: `/usr/share/tesseract-ocr/4.00/tessdata`
|
2. Place the `.traineddata` files in the Tesseract tessdata directory: `/usr/share/tesseract-ocr/4.00/tessdata` (Debian) or `/usr/share/tesseract/tessdata` (Fedora)
|
||||||
|
|
||||||
# DO NOT REMOVE EXISTING ENG.TRAINEDDATA, IT'S REQUIRED.
|
# DO NOT REMOVE EXISTING ENG.TRAINEDDATA, IT'S REQUIRED.
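
The two install steps above could be scripted roughly as follows. This is a minimal sketch, not part of the project: `install_langpack` is a hypothetical helper name, `deu.traineddata` is only an example file, and the candidate directories are the Debian and Fedora paths named in step 2. It copies one file into the first directory that exists and never deletes anything, so an existing `eng.traineddata` stays in place.

```bash
# Hypothetical helper (not part of the project): copy a downloaded
# .traineddata file into the first tessdata directory that exists.
# Usage: install_langpack <file.traineddata> <candidate-dir>...
install_langpack() {
    local src="$1"
    shift
    local dir
    for dir in "$@"; do
        if [ -d "$dir" ]; then
            # cp only adds (or overwrites) the given file; other files in
            # the directory, including eng.traineddata, are left untouched.
            cp -- "$src" "$dir/" && printf '%s\n' "$dir"
            return
        fi
    done
    echo "no tessdata directory found" >&2
    return 1
}
```

A call such as `install_langpack deu.traineddata /usr/share/tesseract-ocr/4.00/tessdata /usr/share/tesseract/tessdata` prints the directory it used. Writing to these system paths will usually require root privileges.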
|
||||||
|
|
||||||
|
@ -48,4 +48,29 @@ Add the following to your existing docker run command
|
||||||
If you are not using Docker, you need to install the OCR components, including the ocrmypdf app.
|
If you are not using Docker, you need to install the OCR components, including the ocrmypdf app.
|
||||||
See the [OCRmyPDF install guide](https://ocrmypdf.readthedocs.io/en/latest/installation.html)
|
See the [OCRmyPDF install guide](https://ocrmypdf.readthedocs.io/en/latest/installation.html)
|
||||||
|
|
||||||
|
On Debian-based systems, install languages with these commands:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
sudo apt update
|
||||||
|
# All languages
|
||||||
|
# sudo apt install -y 'tesseract-ocr-*'
|
||||||
|
|
||||||
|
# Find languages:
|
||||||
|
apt search tesseract-ocr-
|
||||||
|
|
||||||
|
# View installed languages:
|
||||||
|
dpkg-query -W 'tesseract-ocr-*' | sed 's/tesseract-ocr-//g'
|
||||||
|
```
|
||||||
|
|
||||||
|
Fedora:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
# All languages
|
||||||
|
# sudo dnf install -y 'tesseract-langpack-*'
|
||||||
|
|
||||||
|
# Find languages:
|
||||||
|
dnf search -C tesseract-langpack-
|
||||||
|
|
||||||
|
# View installed languages:
|
||||||
|
rpm -qa | grep tesseract-langpack | sed 's/tesseract-langpack-//g'
|
||||||
|
```
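
Independently of the package manager, Tesseract itself can report which languages it sees via `tesseract --list-langs`. The sketch below wraps that check in a small function; `has_lang` is a hypothetical name, and it assumes the language list arrives on standard input with one code per line.

```bash
# Hypothetical helper: succeed if the language code in $1 appears as a
# whole line in the list read from standard input.
has_lang() {
    # -q: no output, -x: match the entire line, so "eng" does not match "enm"
    grep -qx -- "$1"
}
```

For example, `tesseract --list-langs | has_lang deu && echo "German is available"` prints the message only when the `deu` pack is installed where Tesseract looks for it.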
|
||||||
|
|
|
@ -92,8 +92,8 @@ Install the following software:
|
||||||
For Debian-based systems, you can use the following command:
|
For Debian-based systems, you can use the following command:
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
sudo apt-get install -y libreoffice-core libreoffice-common libreoffice-writer libreoffice-calc libreoffice-impress python3-uno unoconv pngquant unpaper ocrmypdf
|
sudo apt-get install -y libreoffice-writer libreoffice-calc libreoffice-impress unpaper ocrmypdf
|
||||||
pip3 install opencv-python-headless
|
pip3 install uno opencv-python-headless unoconv pngquant
|
||||||
```
|
```
|
||||||
|
|
||||||
For Fedora:
|
For Fedora:
|
||||||
|
|