REST API

Introduction

Mayan EDMS provides an HTTP REST Application Program Interface (or API). This API allows integration with 3rd party software using simple HTTP requests.

Several API authentication methods are provided: Session, Token, and HTTP Basic.

The URL for the API can be found via the Tools ‣ REST API menu. The API is also self-documenting. The live API documentation can be found in the Tools ‣ API Documentation (Swagger) menu for the Swagger version and in the Tools ‣ API Documentation (ReDoc) menu for the ReDoc version.

There are a few ways to structure REST APIs. In the case of Mayan EDMS, API endpoints are structured by resource type. Examples:

  • /cabinets - To view or create new cabinets

  • /cabinets/{id} - To view the details, edit, or delete an existing cabinet.

  • /cabinets/{id}/documents - To view, add, or remove documents from an existing cabinet.

  • /cabinets/{id}/documents/{id} - To view, add, or remove one document from an existing cabinet.

The API is versioned with a top level entry. Since Mayan uses semantic versioning and API changes only occur on major versions, the API version will match the major version of the release. For every release of the series 4.x the API version will be “v4”.

The API supports the HTTP verbs: GET, POST, PUT, PATCH, and DELETE.

The API supports result sorting. Like the user interface, only indexed database fields support. The OPTION verb can be used to determine which fields are sortable.

Dynamic fields

The schema of the responses can be modified to reduce the serialization performance penalty of complex API. This is supported via the URL query keys _fields_only and _fields_exclude. _fields_only will cause the schema to include only the fields listed while _fields_exclude works in reverse removing the fields listed.

Related fields from other serializers are supported using the double underscore (__) separator.

Batch API requests

In addition to single resource API requests, batch API requests are also supported. These allow making several API requests in a single HTTP request.

Advanced features like templating, iteration, referencing, and response grouping are also supported.

Batch API requests are encoded in JSON format and sent via POST to the /batch_requests/ endpoint as a list of dictionaries.

Each request service in the same order as it was submitted. As each request is resolved, its output is keep in a volatile context that is accessible to the subsequent requests. This way, one request can access or even iterate over the results of a previous request.

The structure of the an API batch request is as follows:

[
  {
    "body": {},
    "iterables": [],
    "include": "true",
    "group_name": "",
    "method": "GET",
    "name": "document_list",
    "url": "/api/v4/documents/"
  }
]

Body

The body attribute is a dictionary of values to be included in the request. The body is only used for PATCH, POST, and PUT requests.

This attribute supports templating and iteration.

Iterables

The iterables list is a string reference to lists in the context. Each reference can be nested using a dotted notation to traverse the hierarchy. For example to reference the results list of a dictionary named documents with the following structure {"results": [{"id": 1}, {"id": 2}], the reference notation would be "iterables": ["documents.results"]. A new iterable context variable named iterables.0 will be available for the rest of the fields of the request.

Specifying iterables will cause the request to be evaluated and resolved for as many items there are for the each iterable specified.

Iterables can also reference previous iterables to allow for nested loops. To reference a dictionary named document_metadata with the format:

{
  "results": [
    "document": {
      "id": 1,
      "metadata": [
        {
          "metadata_type_id": 1,
          "value": "1"
        },
        {
          "metadata_type_id": 2,
          "value": "2"
        }
      ]
    },
    {
      "document": {
        "id": 2
        "metadata": [
          {
            "metadata_type_id": 3,
            "value": "3"
          }
        ]
      }
    }
  ]
}

The iterables attribute would be as follows:

"iterables": ["document_metadata.results", "iterables.0.metadata"]

This will inject two context variables named iterables.0 which references document_metadata.results and iterables.1 which references every metadata item of each document in iterables.0. The request will be evaluated for each value of iterable.1 times the values of iterables.0.

The attribute does not support templating only direct references or iteration as it is an iterable itself.

Include

The include attribute controls if a requests response is rendered and returned or not. This attribute allows creating requests that are only meant to insert information in the context for use by other request but are of no interest.

This attribute does not support templating or iteration.

Method

The method attribute allows specifying the HTTP method. If not specified, the “GET” method will be used.

This attribute supports templating.

Name

The name attribute uniquely identifies the request and will be used to identify the corresponding response. The value of the name attribute will also be used to insert the response as a dictionary in the context.

This attribute supports templating and iteration.

For example, "name": "document_detail_for_{{ iterables.0.document.id }}".

Group name

The group_name attribute is used to keep all responses of a request that uses iteration and a variable name, as a list that can be accessed via a single group name.

This attribute does not support templating or iteration.

URL

The url attribute specifies the URL path to call. Only specify the path and the query. Example, "url": "/api/v4/documents/"

This attribute does supports templating and iteration.

For example, "url": "/api/v4/documents/{{ iterables.0.id }}/metadata/"

Additional context

The initial request that encapsulates all batch requests is include in the context as the variable view_request. It contains all attributes and values of a normal request including the logged in user, the site instance among other values injected by the middleware classes.

Responses

The response for each batch API request will contain the data returned as a dictionary, the headers as a dictionary, the name of the request, and the status code as an interger.

Each batch API request response will be enclosed in a top level response with following format:

{
  "count": 1,
  "next": "None",
  "previous": "None",
  "results": [
    {
      "data": {},
      "headers": {},
      "name": "",
      "status_code": 200
    }
  ]
}

Design decisions, limitations, caveats

Batch API requests share the same headers as the parent request made to the /batch_requests/ endpoint. This means that all requests in a batch will have the same authentication credentials and content type.

File uploads are not supported. Binary content in responses is returned as a base64 encoded string. In order to allow the encoding the raw data of the response is collapsed. This negates the advantage of the streaming responses used to deliver large binary content.

Batch API requests are limited by the page size. They do not have the ability to iterate over multiple result pages. Increase the page size with the ?page_size URL query to workaround this limitation if the expected response count will exceed the default response page size.

The url attribute only supports local paths, it is not possible to call external APIs.

Batch API requests does not use a traditional HTTP client. Instead it uses an artificial HTTP request which is used to call the views as functions removing the entire HTTP overhead from the process. This means that HTTP timeouts are not required. The only timeout are supported only for the initial POST request encapsulating the batch request as this initial request is a normal HTTP request.

Batch API requests are resolved in sequence, there is no support for parallel execution.

Error reporting is limited. To debug a batch API request, remove the last entries in the list and test until the expected results are obtained. Then add more requests in the list to increase its complexity. Include all responses at first, then as the batch complexity increases, switch the include attribute to false to reduce the response size.

Several of these limitations will be addressed in future updates.

Examples

Install the Python Requests library (http://docs.python-requests.org/en/master/):

pip install requests

Get a list of document types:

import requests

requests.get(url='http://127.0.0.1:8000/api/v4/document_types/', auth=('username', 'password')).json()

{'count': 1,
 'next': None,
 'previous': None,
 'results': [{'delete_time_period': 30,
 'delete_time_unit': 'days',
 'filename_generator_backend': 'uuid',
 'filename_generator_backend_arguments': '',
 'id': 1,
 'label': 'Default',
 'quick_label_list_url': 'http://127.0.0.1:8000/api/v4/document_types/1/quick_labels/',
 'trash_time_period': None,
 'trash_time_unit': None,
 'url': 'http://127.0.0.1:8000/api/v4/document_types/1/'}]}

Create a new document:

requests.post(url='http://127.0.0.1:8000/api/v4/documents/', auth=('username', 'password'), data={'document_type_id': 1, 'label': 'test document'}).json()

{'datetime_created': '2021-06-03T08:57:19.733359Z',
 'description': '',
 'document_change_type_url': 'http://127.0.0.1:8000/api/v4/documents/1/type/change/',
 'document_type': {'delete_time_period': 30,
  'delete_time_unit': 'days',
  'filename_generator_backend': 'uuid',
  'filename_generator_backend_arguments': '',
  'id': 1,
  'label': 'Default',
  'quick_label_list_url': 'http://127.0.0.1:8000/api/v4/document_types/1/quick_labels/',
  'trash_time_period': None,
  'trash_time_unit': None,
  'url': 'http://127.0.0.1:8000/api/v4/document_types/1/'},
 'file_list_url': 'http://127.0.0.1:8000/api/v4/documents/1/files/',
 'id': 1,
 'label': 'test document',
 'language': 'eng',
 'file_latest': None,
 'url': 'http://127.0.0.1:8000/api/v4/documents/1/',
 'uuid': '5f350d5a-cae1-4236-beba-dab187f112e3',
 'version_active': None,
 'version_list_url': 'http://127.0.0.1:8000/api/v4/documents/1/versions/'}

After creating a document, you need to upload a file to it. Use action code 1 to also have a new document version created and its pages pointing to the pages of the new file uploaded:

with open(file='test_document.pdf', mode='rb') as file_object:
    response = requests.post(url='http://127.0.0.1:8000/api/v4/documents/1/files/', auth=('username', 'password'), files={'file_new': file_object}, data={'action': 1})

response.status_code
202

The API will return status code 202 meaning it has accepted the request and will perform the processing in the background.

API Tokens

Use API tokens to avoid sending the username and password on every request. Obtain a token by making a POST request to /api/v4/auth/token/obtain/?format=json:

requests.post(url='http://127.0.0.1:8000/api/v4/auth/token/obtain/?format=json', data={'username': 'username', 'password': 'password'}).json()

{'token': '4ccbc35b5eb327aa82dc3b7c9747b578900f02bb'}

The API token may also be obtained using management commands:

./manage.py drf_create_token admin

Generated token 4ccbc35b5eb327aa82dc3b7c9747b578900f02bb for user admin

Add the API token to the request header:

headers = {'Authorization': 'Token 4ccbc35b5eb327aa82dc3b7c9747b578900f02bb'}

requests.get(url='http://127.0.0.1:8000/api/v4/document_types/', headers=headers).json()

{'count': 1,
 'next': None,
 'previous': None,
 'results': [{'delete_time_period': 30,
 'delete_time_unit': 'days',
 'filename_generator_backend': 'uuid',
 'filename_generator_backend_arguments': '',
 'id': 1,
 'label': 'Default',
 'quick_label_list_url': 'http://127.0.0.1:8000/api/v4/document_types/1/quick_labels/',
 'trash_time_period': None,
 'trash_time_unit': None,
 'url': 'http://127.0.0.1:8000/api/v4/document_types/1/'}]}

Sessions

Use sessions to avoid having to add the headers on each request:

session = requests.Session()

headers = {'Authorization': 'Token 4ccbc35b5eb327aa82dc3b7c9747b578900f02bb'}

session.headers.update(headers)

session.get(url='http://127.0.0.1:8000/api/v4/document_types/')

{'count': 1,
 'next': None,
 'previous': None,
 'results': [{'delete_time_period': 30,
 'delete_time_unit': 'days',
 'filename_generator_backend': 'uuid',
 'filename_generator_backend_arguments': '',
 'id': 1,
 'label': 'Default',
 'quick_label_list_url': 'http://127.0.0.1:8000/api/v4/document_types/1/quick_labels/',
 'trash_time_period': None,
 'trash_time_unit': None,
 'url': 'http://127.0.0.1:8000/api/v4/document_types/1/'}]}

Batch API requests

Search documents for a term and return the results along with their respective metadata types and values.

[
  {
    "name": "document_list",
    "url": "/api/v4/search/advanced/documents.DocumentSearchResult/?label=company AND xyz"
  },
  {
    "iterables": ["document_list.data.results"],
    "name": "document_metadata_{{ iterables.0.id }}",
    "url": "/api/v4/documents/{{ iterables.0.id }}/metadata/"
  }
]

Add the user ID as a document metadata.

[
  {
    "body": {"metadata_type_id": 5, "value": "{{ view_request.user.pk }}"},
    "method": "POST",
    "name": "document_metadata_set",
    "url": "/api/v4/documents/6/metadata/"
  }
]

Search documents for a term, retrieve the metadata values of the results, and patch the metadata values of the results. Return only the response of the last request for verification.

[
  {
    "include": "false",
    "name": "document_list",
    "url": "/api/v4/search/advanced/documents.DocumentSearchResult/?label=company AND xyz"
  },
  {
    "iterables": ["document_list.data.results"],
    "include": "false",
    "group_name": "document_metadata_get",
    "name": "document_metadata_get_{{ iterables.0.id }}",
    "url": "/api/v4/documents/{{ iterables.0.id }}/metadata/"
  },
  {
    "body": {"value": "1234"},
    "iterables": ["groups.document_metadata_get", "iterables.0.data.results"],
    "method": "PATCH",
    "name": "document_metadata_set_{{ iterables.1.document.id }}_{{ iterables.1.id }}",
    "url": "{{ iterables.1.url }}"
  }
]

Dynamic fields

For an fictional document API endpoint that returns:

{
  "datetime_created": "2021-10-30T11:08:14.893527Z",
  "description": "",
  "document_change_type_url": "http://127.0.0.1:8000/api/v4/documents/1/type/change/",
  "document_type": {
    "delete_time_period": 30,
    "delete_time_unit": "days",
    "filename_generator_backend": "uuid",
    "filename_generator_backend_arguments": "",
    "id": 1,
    "label": "Default",
    "quick_label_list_url": "http://127.0.0.1:8000/api/v4/document_types/1/quick_labels/",
    "trash_time_period": null,
    "trash_time_unit": null,
    "url": "http://127.0.0.1:8000/api/v4/document_types/1/"
  },
  "file_latest": {
    "checksum": "0cbfebd6414236de73c851bd8ac16b040e096b3c13c86d7b1024dcedf3291620",
    "comment": "",
    "document_url": "http://127.0.0.1:8000/api/v4/documents/1/",
    "download_url": "http://127.0.0.1:8000/api/v4/documents/1/files/1/download/",
    "encoding": "binary",
    "file": "e3139bfa-f734-42d5-b0b3-9afdf408580e",
    "filename": "lorem impsum",
    "id": 1,
    "mimetype": "application/pdf",
    "page_list_url": "http://127.0.0.1:8000/api/v4/documents/1/files/1/pages/",
    "size": 45726,
    "timestamp": "2021-10-30T11:08:14.972253Z",
    "url": "http://127.0.0.1:8000/api/v4/documents/1/files/1/"
  },
  "file_list_url": "http://127.0.0.1:8000/api/v4/documents/1/files/",
  "id": 1,
  "label": "lorem impsum",
  "language": "eng",
  "url": "http://127.0.0.1:8000/api/v4/documents/1/",
  "uuid": "49395648-96fa-492e-8f44-b568a456859a",
  "version_active": {
    "active": true,
    "comment": "",
    "document_url": "http://127.0.0.1:8000/api/v4/documents/1/",
    "export_url": "http://127.0.0.1:8000/api/v4/documents/1/versions/1/export/",
    "id": 1,
    "page_list_url": "http://127.0.0.1:8000/api/v4/documents/1/versions/1/pages/",
    "timestamp": "2021-10-30T11:08:15.778539Z",
    "url": "http://127.0.0.1:8000/api/v4/documents/1/versions/1/"
  },
  "version_list_url": "http://127.0.0.1:8000/api/v4/documents/1/versions/"
}

Using _fields_only=id,label will return:

{
  "id": 1,
  "label": "lorem impsum",
}

Related fields are supported. Using _fields_only=id,label,document_type__label,file_latest__id will return:

{
  "document_type": {
    "label": "Default",
  },
  "file_latest": {
    "id": 1,
  },
  "id": 1,
  "label": "lorem impsum",
}