Elasticsearch Validate API

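Notes on the Elasticsearch Validate API. Part of the material comes from 中华石杉's Bilibili course "Elasticsearch 顶尖高手系列课程核心知识篇"; the English material, which makes up most of these notes, comes from the official documentation.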

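The Validate API is mostly used for very large, complex searches. If you have written a search body that runs to a hundred lines or more, you can send it to the Validate API first to check whether the query is legal.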

GET /test_index/_validate/query?explain
{
  "query": {
    "math": {
      "test_field": "test"
    }
  }
}

{
  "valid" : false,
  "error" : "ParsingException[unknown query [math] did you mean [match]?]; nested: NamedObjectNotFoundException[[3:13] unknown field [math]];; org.elasticsearch.common.xcontent.NamedObjectNotFoundException: [3:13] unknown field [math]"
}

GET /test_index/_validate/query?explain
{
  "query": {
    "match": {
      "test_field": "test"
    }
  }
}

{
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "failed" : 0
  },
  "valid" : true,
  "explanations" : [
    {
      "index" : "test_index",
      "valid" : true,
      "explanation" : "test_field:test"
    }
  ]
}
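
To make the "validate first, then search" workflow concrete, here is a minimal Python sketch. It is not taken from the course or the official docs: `send` is a hypothetical transport callable (method, path, JSON body) that returns the parsed JSON response, and `search_if_valid` is a name invented for this note. The sketch forwards only the `query` clause to the Validate API and runs the real search only when the response says `valid: true`.

```python
import json
from typing import Callable, Optional

# Hypothetical transport: (method, path, body_json) -> parsed JSON response.
Send = Callable[[str, str, str], dict]


def search_if_valid(send: Send, index: str, body: dict) -> Optional[dict]:
    """Run the search only when the Validate API reports the query as valid."""
    # Only the query clause is validated; size, sort, aggs and so on are left out.
    validate_body = json.dumps({"query": body["query"]})
    verdict = send("GET", f"/{index}/_validate/query?explain=true", validate_body)

    if not verdict.get("valid", False):
        # A top-level "error" shows up for parse failures (as in the "math" example);
        # per-index errors, when present, are listed under "explanations".
        if "error" in verdict:
            reasons = [verdict["error"]]
        else:
            reasons = [e.get("error", "") for e in verdict.get("explanations", [])]
        print("query rejected:", reasons)
        return None

    return send("GET", f"/{index}/_search", json.dumps(body))
```

In a real setup `send` would wrap whatever HTTP client you already use; keeping it as a parameter makes the control flow easy to try out without a running cluster.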

Validate API

Validates a potentially expensive query without executing it. The query can be sent either in the URL, as the q query-string parameter, or in the request body.

GET my-index-000001/_validate/query?q=user.id:kimchy
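
The two ways of sending the query can be sketched in Python as follows. `build_validate_request` is a hypothetical helper written for this note, not part of any Elasticsearch client; it only assembles the path and body and leaves the actual HTTP call to the reader. Boolean options such as `explain`, `rewrite` and `all_shards` are passed as keyword flags.

```python
import json
from typing import Optional, Tuple
from urllib.parse import quote, urlencode


def build_validate_request(index: str,
                           q: Optional[str] = None,
                           query: Optional[dict] = None,
                           **flags: bool) -> Tuple[str, Optional[str]]:
    """Return (path, body) for a _validate/query call.

    Pass `q` for the URL form or `query` for the request-body form.
    `flags` carries boolean URL options such as explain, rewrite, all_shards.
    """
    if (q is None) == (query is None):
        raise ValueError("pass exactly one of q or query")

    # Only flags that are switched on end up in the URL.
    params = {name: "true" for name, on in flags.items() if on}
    if q is not None:
        params["q"] = q

    path = f"/{quote(index)}/_validate/query"
    if params:
        path += "?" + urlencode(params)

    body = json.dumps({"query": query}) if query is not None else None
    return path, body
```

Note that `urlencode` percent-encodes the colon in `user.id:kimchy`, which is the safe choice when the query string contains reserved characters.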

Request

Example

PUT my-index-000015

PUT my-index-000015/_bulk?refresh
{"index": {"_id":1}}
{"user": {"id": "kimchy"}, "@timestamp" : "2099-11-15T14:12:12", "message" : "trying out Elasticsearch"}
{"index": {"_id":2}}
{"user": {"id": "kimchi"}, "@timestamp" : "2099-11-15T14:12:13", "message" : "My user ID is similar to kimchy!"}

// send a valid query
GET my-index-000015/_validate/query?q=user.id:kimchy

{
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "failed" : 0
  },
  "valid" : true
}

// the query may also be sent in the request body
GET my-index-000015/_validate/query
{
  "query": {
    "bool": {
      "must": {
        "query_string": {
          "query": "*.*"
        }
      },
      "filter": {
        "term" : { "user.id": "kimchy" }
      }
    }
  }
}

// Elasticsearch knows the @timestamp field should be a date due to dynamic mapping, and foo does not correctly parse into a date:
GET my-index-000015/_validate/query
{
  "query": {
    "query_string": {
      "query": "@timestamp:foo",
      "lenient": false
    }
  }
}

{
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "failed" : 0
  },
  "valid" : false
}

The explain parameter

An explain parameter can be specified to get more detailed information about why a query failed:

GET my-index-000015/_validate/query?explain=true
{
  "query": {
    "query_string": {
      "query": "@timestamp: foo",
      "lenient": false
    }
  }
}

{
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "failed" : 0
  },
  "valid" : false,
  "explanations" : [
    {
      "index" : "my-index-000015",
      "valid" : false,
      "error" : "[my-index-000015/-dD_PeZjSLykm4LX5eCAHg] QueryShardException[failed to create query: failed to parse date field [foo] with format [strict_date_optional_time||epoch_millis]: [failed to parse date field [foo] with format [strict_date_optional_time||epoch_millis]]]; nested: ElasticsearchParseException[failed to parse date field [foo] with format [strict_date_optional_time||epoch_millis]: [failed to parse date field [foo] with format [strict_date_optional_time||epoch_millis]]]; nested: IllegalArgumentException[failed to parse date field [foo] with format [strict_date_optional_time||epoch_millis]]; nested: DateTimeParseException[Failed to parse with all enclosed parsers];; ElasticsearchParseException[failed to parse date field [foo] with format [strict_date_optional_time||epoch_millis]: [failed to parse date field [foo] with format [strict_date_optional_time||epoch_millis]]]; nested: IllegalArgumentException[failed to parse date field [foo] with format [strict_date_optional_time||epoch_millis]]; nested: DateTimeParseException[Failed to parse with all enclosed parsers];; java.lang.IllegalArgumentException: failed to parse date field [foo] with format [strict_date_optional_time||epoch_millis]"
    }
  ]
}
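
The error strings returned with explain are long because they include the whole chain of nested exceptions. A small Python sketch for pulling out just the outermost reason per index is shown below. `summarize_failures` is a hypothetical helper, and splitting on `"; nested:"` is an assumption based on the sample response above, not a documented format guarantee.

```python
from typing import List, Tuple


def summarize_failures(response: dict) -> List[Tuple[str, str]]:
    """Return (index, short reason) for every invalid entry in a validate response."""
    failures = []
    for item in response.get("explanations", []):
        if item.get("valid", True):
            continue  # valid entries carry "explanation", not "error"
        error = item.get("error", "")
        # Keep only the text before the first nested exception (assumed separator).
        short = error.split("; nested:")[0].strip()
        failures.append((item.get("index", "?"), short))
    return failures
```

Applied to the response above, this keeps the `QueryShardException[...]` part and drops the repeated nested exceptions.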

The rewrite parameter

With rewrite set to true, the explanation is more detailed, showing the actual Lucene query that will be executed:

GET my-index-000015/_validate/query?rewrite=true
{
  "query": {
    "more_like_this": {
      "like": {
        "_id": "2"
      },
      "boost_terms": 1
    }
  }
}

{
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "failed" : 0
  },
  "valid" : true,
  "explanations" : [
    {
      "index" : "my-index-000015",
      "valid" : true,
      "explanation" : """MatchNoDocsQuery("empty BooleanQuery") -ConstantScore(_id:[fe 2f])"""
    }
  ]
}

Rewrite and all_shards parameters

By default, the request is executed on a single, randomly selected shard. The detailed explanation of the query may depend on which shard is hit, and may therefore vary from one request to another. So, when rewriting a query, use the all_shards parameter to get a response from all available shards.

GET my-index-000015/_validate/query?rewrite=true&all_shards=true
{
  "query": {
    "match": {
      "user.id": {
        "query": "kimchy",
        "fuzziness": "auto"
      }
    }
  }
}

{
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "failed" : 0
  },
  "valid" : true,
  "explanations" : [
    {
      "index" : "my-index-000015",
      "shard" : 0,
      "valid" : true,
      "explanation" : "(user.id:kimchi)^0.8333333 user.id:kimchy"
    }
  ]
}
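
When an index has several shards, it can be handy to see at a glance whether the rewritten query differs between them. The Python sketch below groups the explanations of an all_shards response by index and shard. Both function names are hypothetical, and the fallback shard number -1 for entries without a "shard" field is a choice made for this sketch only.

```python
from collections import defaultdict
from typing import Dict, List


def shard_rewrites(response: dict) -> Dict[str, Dict[int, str]]:
    """Map index -> shard number -> rewritten query (or error text)."""
    table: Dict[str, Dict[int, str]] = defaultdict(dict)
    for item in response.get("explanations", []):
        shard = item.get("shard", -1)  # -1 marks entries without a shard field
        text = item.get("explanation", item.get("error", ""))
        table[item["index"]][shard] = text
    return dict(table)


def differing_indices(response: dict) -> List[str]:
    """Return the indices whose shards produced different rewritten queries."""
    return sorted(
        index
        for index, shards in shard_rewrites(response).items()
        if len(set(shards.values())) > 1
    )
```

For the single-shard response above, `differing_indices` returns an empty list, since there is only one rewrite to compare.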