Merge branch 'main' into patch-4
commit 093dcba4ba
2 changed files with 28 additions and 3 deletions
@@ -18,7 +18,7 @@ Depending on your requirements, you can choose the appropriate language pack for
### Installing Language Packs

1. Download the desired language pack(s) by selecting the `.traineddata` file(s) for the language(s) you need.
-2. Place the `.traineddata` files in the Tesseract tessdata directory: `/usr/share/tesseract-ocr/4.00/tessdata`
+2. Place the `.traineddata` files in the Tesseract tessdata directory: `/usr/share/tesseract-ocr/4.00/tessdata` (Debian) or `/usr/share/tesseract/tessdata` (Fedora)

# DO NOT REMOVE THE EXISTING ENG.TRAINEDDATA, IT IS REQUIRED.

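The two installation steps can also be scripted. The following is a minimal sketch and not part of this commit: the function names `first_existing_dir` and `install_language_packs` are hypothetical, and the download folder in the usage comment is an assumption. It skips any file that already exists in the destination, so the required `eng.traineddata` is never overwritten.

```bash
#!/usr/bin/env bash

# Hypothetical helper: print the first argument that is an existing directory.
first_existing_dir() {
  local dir
  for dir in "$@"; do
    if [ -d "$dir" ]; then
      printf '%s\n' "$dir"
      return 0
    fi
  done
  return 1
}

# Hypothetical helper: copy every .traineddata file from $1 into $2,
# skipping files that already exist there (eng.traineddata stays untouched).
install_language_packs() {
  local src="$1" dest="$2" file name
  for file in "$src"/*.traineddata; do
    # If nothing matches, the glob stays literal; skip it.
    if [ ! -e "$file" ]; then
      continue
    fi
    name="$(basename "$file")"
    if [ -e "$dest/$name" ]; then
      echo "skipping $name (already installed)"
      continue
    fi
    cp "$file" "$dest/$name"
  done
}

# Example (run as root; ~/Downloads is an assumed download location):
#   tessdata="$(first_existing_dir /usr/share/tesseract-ocr/4.00/tessdata /usr/share/tesseract/tessdata)"
#   install_language_packs "$HOME/Downloads" "$tessdata"
```

Passing both documented directories to `first_existing_dir` lets the same script work on Debian and Fedora without hard-coding either path.
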
@@ -48,4 +48,29 @@ Add the following to your existing docker run command
If you are not using Docker, you need to install the OCR components, including the ocrmypdf app.
See the [OCRmyPDF install guide](https://ocrmypdf.readthedocs.io/en/latest/installation.html).

On Debian-based systems, install languages with these commands:

```bash
sudo apt update
# All languages
# sudo apt install -y 'tesseract-ocr-*'

# Find languages:
apt search tesseract-ocr-

# View installed languages (the pattern needs a wildcard to match anything):
dpkg-query -W 'tesseract-ocr-*' | sed 's/tesseract-ocr-//g'
```

Fedora:

```bash
# All languages (quoted so the shell does not expand the glob)
# sudo dnf install -y 'tesseract-langpack-*'

# Find languages:
dnf search -C tesseract-langpack-

# View installed languages:
rpm -qa | grep tesseract-langpack | sed 's/tesseract-langpack-//g'
```
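
Installing every language pulls in many packages, and often only a few are needed. As a sketch, assuming the package prefixes used in the commands here (`tesseract-ocr-` on Debian, `tesseract-langpack-` on Fedora) and a hypothetical helper name `lang_packages`, the package list for specific language codes could be built like this. Package naming can vary for some codes, so it is worth confirming each name with the distribution's search command before installing.

```bash
#!/usr/bin/env bash

# Hypothetical helper: print the package names for the given language codes.
# $1 is the distro family ("debian" or "fedora"); the rest are language codes.
lang_packages() {
  local family="$1" prefix code
  shift
  case "$family" in
    debian) prefix="tesseract-ocr-" ;;
    fedora) prefix="tesseract-langpack-" ;;
    *) echo "unknown distro family: $family" >&2; return 1 ;;
  esac
  for code in "$@"; do
    printf '%s%s\n' "$prefix" "$code"
  done
}

# Example (language codes are illustrative):
#   sudo apt install -y $(lang_packages debian deu fra)
#   sudo dnf install -y $(lang_packages fedora deu fra)
```
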
@@ -92,8 +92,8 @@ Install the following software:
For Debian-based systems, you can use the following commands:

```bash
-sudo apt-get install -y libreoffice-core libreoffice-common libreoffice-writer libreoffice-calc libreoffice-impress python3-uno unoconv pngquant unpaper ocrmypdf
-pip3 install opencv-python-headless
+sudo apt-get install -y libreoffice-writer libreoffice-calc libreoffice-impress unpaper ocrmypdf
+pip3 install uno opencv-python-headless unoconv pngquant
```
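
After installation it can help to confirm that the tools are actually on `PATH`. The following is a minimal sketch using the shell's `command -v`. The helper name `missing_commands` is hypothetical, and the command names in the usage comment are assumptions about what the packages install, which may differ by distribution.

```bash
#!/usr/bin/env bash

# Hypothetical helper: print each named command that is not found on PATH.
# Returns 0 when all are present, 1 when at least one is missing.
missing_commands() {
  local cmd status=0
  for cmd in "$@"; do
    if ! command -v "$cmd" >/dev/null 2>&1; then
      printf '%s\n' "$cmd"
      status=1
    fi
  done
  return "$status"
}

# Example (command names are assumptions; adjust to your system):
#   missing_commands ocrmypdf unpaper soffice
```
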

For Fedora: