Search

Important

Our commitment to providing a top-notch document management system and resources remains stronger than ever.

As part of this journey, this chapter has been expanded and migrated to the Knowledge base.

This means that you now have access to an even richer and more comprehensive range of resources to supercharge your document management journey.

Upgrade now to access this improved chapter as well as tutorials, troubleshooting guides, and in-depth articles covering all aspects of Mayan EDMS.

Note

Knowledge Base content

Articles:

Understanding the search lexicon
Mayan EDMS search system
The batch request API system

Tutorials:

How to enable Mayan EDMS to use Elasticsearch
Using extra worker containers
How to enable RabbitMQ’s administrative portal

Troubleshooting:

How to retrieve the list of available search models

Backends

Django

Path: mayan.apps.dynamic_search.backends.django.DjangoSearchBackend

This was the first backend supported. It uses the same database as the rest of the system to emulate a search engine.

Because it uses the database, it requires neither external services nor reindexing to keep its content up to date.

The downside to this backend is that it is slow and can overload the database, degrading the performance of the entire deployment.

Since version 4.2 it is no longer the default search backend.

Unsupported features

Accent folding.
Case folding.
Fuzzy searches are emulated and might not return the same results as a search engine that has native support for fuzzy searches.
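
To illustrate what accent folding and case folding mean, here is a minimal sketch that uses only the Python standard library. It is not Mayan EDMS code. `fold` is a hypothetical helper showing the kind of normalization a backend with folding support would apply to both the indexed text and the query.

```python
import unicodedata


def fold(text: str) -> str:
    """Return text with accents removed and case differences erased."""
    # NFKD splits an accented character into a base character plus
    # combining marks, for example "é" becomes "e" followed by U+0301.
    decomposed = unicodedata.normalize('NFKD', text)
    # Accent folding: drop the combining marks.
    without_accents = ''.join(
        character for character in decomposed
        if not unicodedata.combining(character)
    )
    # Case folding: casefold() is a more aggressive form of lower().
    return without_accents.casefold()


# Compare the raw strings first, then the folded versions.
print('Résumé' == 'resume')
print(fold('Résumé') == fold('resume'))
```

The raw strings compare as different, while the folded versions compare as equal. A backend without these features compares terms closer to the first way, so the spelling and case used in the query matter more.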

Whoosh

New in version 3.5.

Path: mayan.apps.dynamic_search.backends.whoosh.WhooshSearchBackend

This backend uses the Python Whoosh search library. Whoosh uses local files for indexing. Because of this, it runs in the same context as Mayan EDMS and no external services are required. Using and backing up Whoosh is very easy.

The downside to this backend is that it can only be used when Mayan EDMS is configured to use block storage. The Mayan EDMS implementation of Whoosh also uses a distributed lock to avoid concurrent writes and possible corruption. This slows down the update process of the search index.

This backend provides search functionality that is simple to set up and works well for small to intermediate installations.

In version 4.2, the Whoosh backend was completed and became the default search backend.

This engine supports specialized date parsing. To use this feature, pass the date term as a raw term.

Example: =`['last tuesday' to 'next friday']`

More examples of date parsing can be found in https://whoosh.readthedocs.io/en/latest/dates.html#parsing-date-queries

To pass reserved characters or symbols that have special meaning to the preprocessor or to the Whoosh parsers, pass them as a raw term and also enclose them in single quotes.

Example: To search for terms containing the < symbol, use =`'<'`

More details can be found in https://whoosh.readthedocs.io/en/latest/querylang.html#making-a-term-from-literal-text
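
The two examples above can be reproduced with a small string-building sketch. It assumes, based only on those examples, that a leading `=` is what marks a term as raw. `raw_term` and `literal_raw_term` are hypothetical helpers and are not part of Mayan EDMS or Whoosh.

```python
def raw_term(expression: str) -> str:
    """Prefix an expression with "=", the marker seen in the examples."""
    return '=' + expression


def literal_raw_term(text: str) -> str:
    """Wrap text in single quotes and pass it as a raw term."""
    if "'" in text:
        # Escaping embedded quotes is out of scope for this sketch.
        raise ValueError('text must not contain a single quote')
    return raw_term("'" + text + "'")


# Date range handled by the date parser.
print(raw_term("['last tuesday' to 'next friday']"))
# Reserved symbol passed as literal text.
print(literal_raw_term('<'))
```

The helpers only build the query string. Parsing and matching are still done by the search backend.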

ElasticSearch

New in version 4.2.

Path: mayan.apps.dynamic_search.backends.elasticsearch.ElasticSearchBackend

This backend uses ElasticSearch via a local API client. ElasticSearch must be deployed as an external service, either manually or automatically using the official Docker Compose file.

ElasticSearch scales very well and can support millions of documents and many concurrent search requests. ElasticSearch can also be clustered to add more capabilities.

The downside is that ElasticSearch has high resource requirements and an extensive but complex search syntax. Mayan EDMS only uses a subset of the search features provided by ElasticSearch.

This backend is recommended for large installations having a high number of documents and concurrent users.

Considerations

When changing the search backend, it is also necessary to launch the “Reindex search backend” action from the Tools menu to initialize the search engine index.

This action is only required once. Afterwards, the search engine will be updated as objects are added, removed, or edited.

Settings

SEARCH_BACKEND

Full path to the backend to be used to handle the search.

Default:
```
mayan.apps.dynamic_search.backends.whoosh.WhooshSearchBackend
```
SEARCH_BACKEND_ARGUMENTS

Arguments to pass to the search backend. For example, values that change the behavior, host names, or authentication arguments.

Default:
```
{}
```
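
As a rough sketch of how `SEARCH_BACKEND` and `SEARCH_BACKEND_ARGUMENTS` fit together, the snippet below pairs one of the three backend paths documented in this chapter with a dictionary of arguments. `build_search_settings` is a hypothetical helper, and the `client_hosts` key and the host URL are assumptions used for illustration only. Check the chosen backend for the argument names it actually accepts.

```python
# The three backend paths documented in this chapter.
DOCUMENTED_BACKENDS = (
    'mayan.apps.dynamic_search.backends.django.DjangoSearchBackend',
    'mayan.apps.dynamic_search.backends.whoosh.WhooshSearchBackend',
    'mayan.apps.dynamic_search.backends.elasticsearch.ElasticSearchBackend',
)


def build_search_settings(backend, arguments=None):
    """Return both settings as a dictionary after a basic sanity check."""
    if backend not in DOCUMENTED_BACKENDS:
        raise ValueError('Unknown search backend path: ' + backend)
    return {
        'SEARCH_BACKEND': backend,
        # An empty dictionary matches the documented default.
        'SEARCH_BACKEND_ARGUMENTS': dict(arguments or {}),
    }


settings = build_search_settings(
    backend=DOCUMENTED_BACKENDS[2],
    # Assumption: 'client_hosts' and the URL are illustrative values.
    arguments={'client_hosts': ['http://elasticsearch:9200']},
)
print(settings)
```

After switching backends this way, the "Reindex search backend" action described under Considerations still needs to be run once.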
SEARCH_DEFAULT_OPERATOR

The search operator to use when none is specified.

Default:
```
AND
```
Choices:
```
AND,NOT,OR
```
SEARCH_DISABLE_SIMPLE_SEARCH

Disables the single-term search bar, leaving only the advanced search button.

Default:
```
false
```
Choices:
```
false,true
```
SEARCH_INDEXING_CHUNK_SIZE

Number of objects to process in each chunk when performing bulk indexing.

Default:
```
25
```
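
The idea behind a chunk size can be sketched in a few lines of Python. This is not the Mayan EDMS indexing code. `chunked` is a hypothetical helper that shows how a long sequence of objects is split into groups of at most 25 so that each group can be indexed in a single step.

```python
from itertools import islice


def chunked(iterable, chunk_size=25):
    """Yield lists of at most chunk_size items from iterable."""
    if chunk_size < 1:
        raise ValueError('chunk_size must be at least 1')
    iterator = iter(iterable)
    while True:
        # Take the next chunk_size items, or fewer at the end.
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            return
        yield chunk


# 60 hypothetical object IDs are split into chunks of 25, 25 and 10.
chunk_lengths = [len(chunk) for chunk in chunked(range(60))]
print(chunk_lengths)
```

A larger chunk size means fewer indexing steps with more objects held in memory per step. A smaller one has the opposite effect.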
SEARCH_MATCH_ALL_DEFAULT_VALUE

Sets the default state of the “Match all” checkbox.

Default:
```
'False'
```
SEARCH_QUERY_RESULTS_LIMIT

Maximum number of search results to fetch and display per search query unit.

Default:
```
100000
```
SEARCH_RESULTS_LIMIT

Maximum number of search results to fetch and display.

Default:
```
1000
```
SEARCH_SAVED_RESULTSETS_PER_USER_LIMIT

Maximum number of saved resultsets to keep per user.

Default:
```
10
```
SEARCH_SAVED_RESULTSET_RESULTS_LIMIT

Maximum number of results to store per resultset.

Default:
```
1000
```
SEARCH_SAVED_RESULTSET_TIME_TO_LIVE

Time, in seconds, to keep the resultset.

Default:
```
300
```
SEARCH_SAVED_RESULTSET_TIME_TO_LIVE_INCREMENT

Amount by which to increase the time to live each time the resultset is accessed.

Default:
```
60
```
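
One way to read the interaction between the time to live and its increment is as a sliding expiry. The sketch below uses that interpretation with the documented defaults of 300 and 60. `SavedResultset` is a hypothetical class, not the Mayan EDMS model, and timestamps are passed in explicitly so that the behavior is easy to follow.

```python
class SavedResultset:
    """Hypothetical stand-in for a stored resultset with a sliding expiry."""

    def __init__(self, created_at, time_to_live=300, increment=60):
        self.increment = increment
        self.expires_at = created_at + time_to_live

    def access(self, now):
        """Return True and extend the expiry if the resultset is alive."""
        if now >= self.expires_at:
            # Already expired: accessing it does not revive it.
            return False
        self.expires_at += self.increment
        return True


resultset = SavedResultset(created_at=0)
# Accessed at second 290, before the initial expiry at second 300.
print(resultset.access(now=290), resultset.expires_at)
# Accessed at second 359, alive only because of the earlier extension.
print(resultset.access(now=359), resultset.expires_at)
# Accessed at second 500, after the extended expiry has passed.
print(resultset.access(now=500), resultset.expires_at)
```

Under this reading, a resultset that is accessed regularly stays available beyond its initial time to live, while one that is left untouched expires on schedule.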
SEARCH_STORE_RESULTS_DEFAULT_VALUE

Sets the default state of the “Store results” checkbox.

Default:
```
false
```