| by Arround The Web | No comments

Elasticsearch Multi-Get

This article will discuss on how to use the Elasticsearch multi-get API to fetch multiple JSON documents based on their IDs. In addition, Elasticsearch allows you to use a single get query to retrieve the documents from indices using only the document IDs.

Let’s explore.

Request Syntax

The following is the syntax for the Elasticsearch multi-get API:

GET /_mget
GET /<index>/_mget

The multi-get API supports multiple indices which allows you to fetch the documents even if they are not in the same index.

The request supports the following path parameters:

  1. <index> – The name of the index from which to retrieve the documents as specified by their IDs.

You can also specify the other query parameters as shown:

  1. Preference – Defines the preferred node or shard.
  2. Realtime – If set to true, the operation is performed in real-time.
  3. Refresh – Forces the operation to refresh the target shards before fetching the specified documents.
  4. Routing – A value that is used to route the operations to a specific shard.
  5. Store_fields – Retrieves the document fields stored in an index rather than the document.
  6. _source – A Boolean value that defines if the request should return the _source field or not.

The Query requires the body, which includes the following values:

  1. Docs – Specifies the documents that you wish to fetch. In addition, this section supports the following attributes:
    • _id – Unique ID of the target document.
    • _index – The index that contains the target document.
    • Routing – The key for the primary shard of the document.
    • _source – If true, it includes all source fields; otherwise, it excludes them.
    • _stored_fields – The stored_fields that you wish to include.
  2. Ids – The ids of the documents that you wish to fetch.

Example 1: Fetch Multiple Documents from the Same Index

The following example shows how to use the Elasticsearch multi-get API to retrieve the documents with specific IDs from the Netflix index:

curl -XGET "http://localhost:9200/netflix/_mget" -H "kbn-xsrf: reporting" -H "Content-Type: application/json" -d'
{
  "docs": [
    {
      "_id": "T3wnVoMBck2AEzXPytlJ"
    },
    {
      "_id": "W3wnVoMBck2AEzXPytlJ"
    }
  ]
}'

The given request should fetch the documents with the specified IDs from the Netflix index. The resulting output is as shown:

{
  "docs": [
    {
      "_index": "netflix",
      "_id": "T3wnVoMBck2AEzXPytlJ",
      "_version": 1,
      "_seq_no": 0,
      "_primary_term": 1,
      "found": true,
      "_source": {
        "duration": "90 min",
        "listed_in": "Documentaries",
        "country": "United States",
        "date_added": "September 25, 2021",
        "show_id": "s1",
        "director": "Kirsten Johnson",
        "release_year": 2020,
        "rating": "PG-13",
        "description": "As her father nears the end of his life, filmmaker Kirsten Johnson stages his death in inventive and comical ways to help them both face the inevitable.",
        "type": "Movie",
        "title": "Dick Johnson Is Dead"
      }
    },
    {
      "_index": "netflix",
      "_id": "W3wnVoMBck2AEzXPytlJ",
      "_version": 1,
      "_seq_no": 12,
      "_primary_term": 1,
      "found": true,
      "_source": {
        "country": "Germany, Czech Republic",
        "show_id": "s13",
        "director": "Christian Schwochow",
        "release_year": 2021,
        "rating": "TV-MA",
        "description": "After most of her family is murdered in a terrorist bombing, a young woman is unknowingly lured into joining the very group that killed them.",
        "type": "Movie",
        "title": "Je Suis Karl",
        "duration": "127 min",
        "listed_in": "Dramas, International Movies",
        "cast": "Luna Wedler, Jannis Niewöhner, Milan Peschel, Edin Hasanović, Anna Fialová, Marlon Boess, Victor Boccard, Fleur Geffrier, Aziz Dyab, Mélanie Fouché, Elizaveta Maximová",
        "date_added": "September 23, 2021"
      }
    }
  ]

}

We can also simplify the request by putting the document IDs in a simple array as shown in the following:

curl -XGET "http://localhost:9200/netflix/_mget" -H "kbn-xsrf: reporting" -H "Content-Type: application/json" -d'
{
     "ids": ["T3wnVoMBck2AEzXPytlJ", "W3wnVoMBck2AEzXPytlJ"]
}'

The previous request should perform a similar action.

Example 2: Fetch the Documents from Multiple Indicies

In the following example, the request fetches multiple documents from different indices as shown:

curl -XGET "http://localhost:9200/_mget" -H "kbn-xsrf: reporting" -H "Content-Type: application/json" -d'
{
  "docs": [
    {
      "_index": "netflix",
      "_id": "T3wnVoMBck2AEzXPytlJ"
    },
    {
      "_index": "disney",
      "_id": "8j4wWoMB1yF5VqfaKCE4"
    }
  ]
}'

The resulting output is as shown:

Example 3: Exclude Specific Fields

We can exclude specific fields from a given request using the source_include and source_exclude parameters.

An example is as shown:

curl -XGET "http://localhost:9200/_mget" -H "kbn-xsrf: reporting" -H "Content-Type: application/json" -d'
{
  "docs": [
    {
      "_index": "netflix",
      "_id": "T3wnVoMBck2AEzXPytlJ",
      "_source": false
    },
    {
      "_index": "netflix",
      "_id": "T3wnVoMBck2AEzXPytlJ",
      "_source": {
        "include": [ "listed_in", "release_year", "title" ],
        "exclude": [ "description", "type", "date_added" ]
      }
    }
  ]

}'

The given request uses the source include and exclude to specify which fields you wish to retrieve in a given document.

The resulting output is as shown:

Conclusion

In this post, we discussed the fundamentals of working with Elasticsearch multi-get API which allows you to fetch multiple documents from various sources based on their IDs. Feel free to explore the other documents for more information.

Happy coding!

Leave a Reply