OCR

OCR is not working for my specific language

To keep the Docker image small, only some of the Tesseract OCR language files are included.

To see which Tesseract language packages are available, run:

apt-cache search tesseract-ocr

Then install the package for the language you need (German in this example):

apt-get install tesseract-ocr-deu
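
After installing, it can help to confirm that Tesseract actually sees the new language. The sketch below assumes the tesseract binary is on the PATH and uses its --list-langs option; has_lang is a hypothetical helper name introduced here for illustration, and deu is the German code from the example above.

```shell
# Hypothetical helper (not part of Tesseract or Mayan EDMS): succeeds if the
# language code given as $1 appears as a whole line on standard input.
has_lang() {
    grep -qx -- "$1"
}

# Only attempt the check when the tesseract binary is available.
if command -v tesseract >/dev/null 2>&1; then
    # Older Tesseract releases print the language list on stderr, so both
    # streams are merged before filtering.
    if tesseract --list-langs 2>&1 | has_lang deu; then
        echo "German OCR data is installed"
    else
        echo "German OCR data is missing"
    fi
fi
```

If the language is reported as missing, the package was most likely not installed in the environment where the OCR process runs.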

If you are using the Docker image, pass the environment variable MAYAN_APT_INSTALLS set to the name of the corresponding Tesseract language package. Example:

-e MAYAN_APT_INSTALLS='tesseract-ocr-deu'