E-commerce Product Management (Part 2): Multiple Ways to Search

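This material comes from the core-knowledge part of 中华石杉's Elasticsearch top-expert course series on Bilibili; some of the code has been upgraded from Elasticsearch 5.2 to 7.10.1.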

This is part two. It continues from the previous part and gives you a rough feel for Elasticsearch's search features.

1. query string search

Search all products:

GET /ecommerce/_search

{
  "took" : 58,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 5,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "ecommerce",
        "_type" : "product",
        "_id" : "2",
        "_score" : 1.0,
        "_source" : {
          "name" : "jiajieshi yagao",
          "desc" : "youxiao fangzhu",
          "price" : 25,
          "producer" : "jiajieshi producer",
          "tags" : [
            "fangzhu"
          ]
        }
      },
...

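Breakdown of the response:

  • took: how many milliseconds the request took

  • timed_out: whether the request timed out; here it did not

  • _shards: how many shards the search touched. This index has a single primary shard (total is 1; the course's 5.x setup defaulted to 5), and a search request is sent to every primary shard of the index (or to one of its replica shards instead)

  • hits.total: the number of matching documents, 5 here

  • hits.max_score: the highest score among the hits. A score measures how relevant a document is to the search: the more relevant the document, the higher its score

  • hits.hits: the detailed data of the documents that matched the search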
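The fields listed above can be read out of a parsed response with a few dictionary lookups. The following is a minimal sketch in Python; the helper name summarize_response is made up for illustration, and the sample dictionary is a trimmed copy of the response shown above with only the first hit kept.

```python
def summarize_response(response):
    """Pull the headline fields out of a parsed _search response body."""
    hits = response["hits"]
    return {
        "took_ms": response["took"],
        "timed_out": response["timed_out"],
        "shards_total": response["_shards"]["total"],
        "shards_failed": response["_shards"]["failed"],
        "total_hits": hits["total"]["value"],
        "max_score": hits["max_score"],
        "ids": [hit["_id"] for hit in hits["hits"]],
    }


# Trimmed copy of the response above; the remaining hits are left out.
sample = {
    "took": 58,
    "timed_out": False,
    "_shards": {"total": 1, "successful": 1, "skipped": 0, "failed": 0},
    "hits": {
        "total": {"value": 5, "relation": "eq"},
        "max_score": 1.0,
        "hits": [
            {
                "_index": "ecommerce",
                "_type": "product",
                "_id": "2",
                "_score": 1.0,
                "_source": {"name": "jiajieshi yagao", "price": 25},
            }
        ],
    },
}

summary = summarize_response(sample)
print(summary)
```

Note that hits.total reports all 5 matches even though the trimmed sample carries a single hit, so the two numbers should not be confused.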

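Where the name query string search comes from: the search parameters are all passed in the query string of the HTTP request. Anyone who has done web development will be familiar with query strings.

Search for products whose name contains yagao, sorted by price in descending order: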

GET /ecommerce/_search?q=name:yagao&sort=price:desc

{
  "took" : 2418,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 4,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [
      {
        "_index" : "ecommerce",
        "_type" : "product",
        "_id" : "2D1c23YB9ErK6TlW7Rni",
        "_score" : null,
        "_source" : {
          "name" : "shuermin yagao",
          "desc" : "mingan",
          "price" : 60,
          "producer" : "shuermin producer",
          "tags" : [
            "mingan"
          ]
        },
        "sort" : [
          60
        ]
      },
      ...

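This style suits ad hoc use from the command line with a tool such as curl, where you want to fire off a quick request and look something up. A complex query, however, is hard to express this way, so query string search is rarely used in production.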
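To make the "parameters travel in the query string" idea concrete, here is a small Python sketch that assembles the request path for a query string search using only the standard library. The function name query_string_search_path is hypothetical; nothing is sent to a server, the code only builds the path.

```python
from urllib.parse import urlencode


def query_string_search_path(index, **params):
    """Build the path of a query string search; no request is sent."""
    # safe=":" keeps the field:value colon readable instead of %3A.
    query = urlencode(params, safe=":")
    if not query:
        return f"/{index}/_search"
    return f"/{index}/_search?{query}"


path = query_string_search_path("ecommerce", q="name:yagao", sort="price:desc")
print(path)  # /ecommerce/_search?q=name:yagao&sort=price:desc
```

Every extra condition has to be squeezed into the q parameter, which is one way to see why this style becomes awkward for complex queries.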

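While running these requests I noticed that the first execution of a statement is usually slower and later executions are much faster, which is probably due to caching.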

2. query DSL

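DSL: Domain Specific Language

HTTP request body: the query is written as JSON in the request body. This is convenient, can express all kinds of complex queries, and is far more powerful than query string search.

Query all products: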

GET /ecommerce/_search
{
  "query":{"match_all":{}}
}

Query products whose name contains yagao, sorted by price in descending order:

GET /ecommerce/_search
{
  "query" : {
    "match" : {
      "name":"yagao"
    }
  },
  "sort":[
    {"price":"desc"}
  ]
}

Paginated query: assuming each page shows 1 product and page 2 is wanted, the request below returns the second product:

GET /ecommerce/_search
{
  "query":{"match_all":{}},
  "from":1,
  "size":1
}

Note that from counts from 0.
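Because from is zero-based, a 1-based page number has to be converted before it goes into the request body. A minimal Python sketch of that arithmetic follows; the function name paged_match_all is an assumption for illustration, and the body it returns mirrors the request above.

```python
def paged_match_all(page, page_size):
    """Build a match_all request body for a 1-based page number."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be at least 1")
    return {
        "query": {"match_all": {}},
        # Skip the documents that belong to the earlier pages.
        "from": (page - 1) * page_size,
        "size": page_size,
    }


body = paged_match_all(page=2, page_size=1)
print(body)
```

With page 2 and a page size of 1 this yields from 1 and size 1, the same values as in the request above.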

Return only the name and price of each product:

GET /ecommerce/_search
{
  "query": {"match_all":{}},
  "_source":["name", "price"]
}
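The effect of the _source list on each hit can be pictured with a plain-Python stand-in. This is only an illustration of the outcome, not how Elasticsearch implements it; filter_source is a made-up name, and it handles only exact top-level field names, whereas Elasticsearch also accepts wildcard patterns.

```python
def filter_source(source, fields):
    """Keep only the listed top-level fields of a document."""
    return {name: source[name] for name in fields if name in source}


# One of the products shown in the responses in this article.
doc = {
    "name": "jiajieshi yagao",
    "desc": "youxiao fangzhu",
    "price": 25,
    "producer": "jiajieshi producer",
    "tags": ["fangzhu"],
}

trimmed = filter_source(doc, ["name", "price"])
print(trimmed)  # {'name': 'jiajieshi yagao', 'price': 25}
```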

Query DSL is better suited to production use because it can express complex queries.

3. query filter

Search for products whose name contains yagao and whose price is greater than 40 yuan:

GET /ecommerce/_search
{
  "query":{
    "bool":{
      "must":{
        "match":{
          "name":"yagao"
        }
      },
      "filter":{
        "range":{
          "price":{"gt":40}
        }
      }
    }
  }
}
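The bool query above combines a scored condition (must) with a yes-or-no condition (filter); the filter clause only includes or excludes documents and does not contribute to the score. The Python sketch below emulates that logic in memory over the three products whose data is visible in the responses in this article. The helper names are hypothetical, the tokenization is a plain whitespace split, and no scoring is attempted.

```python
# The three products whose fields appear in the responses in this article.
products = [
    {"name": "jiajieshi yagao", "price": 25},
    {"name": "shuermin yagao", "price": 60},
    {"name": "update post version + 1", "price": 30},
]


def matches_name(doc, text):
    """Stand-in for the match clause: any query term occurs in the name."""
    name_terms = set(doc["name"].lower().split())
    return any(term in name_terms for term in text.lower().split())


def price_gt(doc, bound):
    """Stand-in for the range filter with gt."""
    return doc["price"] > bound


hits = [d for d in products if matches_name(d, "yagao") and price_gt(d, 40)]
print([d["name"] for d in hits])  # ['shuermin yagao']
```

Of the two yagao products in this sample, only the one priced at 60 passes the price filter.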

4. full-text search

GET /ecommerce/_search
{
  "query":{
    "match":{
      "producer":"yagao producer"
    }
  }
}

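Whatever technology you are learning, whether Java, Linux, shell, JavaScript, or Hadoop, do the work yourself. In particular, type the commands and code by hand and keep copy and paste to a minimum. You learn best when you type things out yourself.

The producer field is split into terms first, and an inverted index is built from them. The search string is split the same way:

yagao producer  --->  yagao and producer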
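The splitting and lookup described above can be sketched with a toy inverted index in Python. This is a simplified model under stated assumptions: analyze is a lowercase whitespace split standing in for the real analyzer, the function names are invented for illustration, and the producer values and ids are the ones visible in the responses in this article.

```python
from collections import defaultdict

# producer values keyed by document id, as seen in this article's responses.
producers = {
    "1": "special jiajieshi producer",
    "2": "jiajieshi producer",
    "2D1c23YB9ErK6TlW7Rni": "shuermin producer",
}


def analyze(text):
    """Toy analyzer: lowercase, then split on whitespace."""
    return text.lower().split()


def build_inverted_index(docs):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in analyze(text):
            index[term].add(doc_id)
    return index


def match(index, query):
    """Return ids of documents containing at least one query term."""
    result = set()
    for term in analyze(query):
        result |= index.get(term, set())
    return result


index = build_inverted_index(producers)
matched = match(index, "yagao producer")
print(sorted(matched))
```

No producer value contains yagao, yet all three documents are returned, because matching the single term producer is enough.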

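5. phrase search

Phrase search is the counterpart of full-text search. Full-text search splits the search string into terms and looks each one up in the inverted index; a document is returned as long as it matches any one of the terms.

Phrase search requires the field text to contain the whole search string, with the same terms adjacent and in the same order; only then does the document count as a match and get returned.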

GET /ecommerce/_search
{
  "query":{
    "match_phrase":{
      "producer":"yagao producer"
    }
  }
}

{
  "took" : 139,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 0,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  }
}

GET /ecommerce/_search
{
  "query":{
    "match_phrase":{
      "producer":"jiajieshi producer"
    }
  }
}

{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : 0.9162046,
    "hits" : [
      {
        "_index" : "ecommerce",
        "_type" : "product",
        "_id" : "2",
        "_score" : 0.9162046,
        "_source" : {
          "name" : "jiajieshi yagao",
          "desc" : "youxiao fangzhu",
          "price" : 25,
          "producer" : "jiajieshi producer",
          "tags" : [
            "fangzhu"
          ]
        }
      },
      {
        "_index" : "ecommerce",
        "_type" : "product",
        "_id" : "1",
        "_score" : 0.76588964,
        "_source" : {
          "name" : "update post version + 1",
          "desc" : "gaoxiao meibai",
          "price" : 30,
          "producer" : "special jiajieshi producer",
          "tags" : [
            "meibai",
            "fangzhu"
          ]
        }
      }
    ]
  }
}
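The two results above (no hits for yagao producer, two hits for jiajieshi producer) can be reproduced with a small Python sketch of the phrase rule: the query terms must occur next to each other and in the same order. The function name contains_phrase and the whitespace tokenization are assumptions for illustration; the producer values are the ones visible in this article's responses.

```python
def contains_phrase(field_text, phrase):
    """True if the phrase terms occur adjacent and in order in the field."""
    terms = field_text.lower().split()
    wanted = phrase.lower().split()
    if not wanted:
        return False
    width = len(wanted)
    # Slide a window of the phrase length across the field terms.
    return any(
        terms[start:start + width] == wanted
        for start in range(len(terms) - width + 1)
    )


producers = [
    "jiajieshi producer",
    "special jiajieshi producer",
    "shuermin producer",
]

no_hits = [p for p in producers if contains_phrase(p, "yagao producer")]
two_hits = [p for p in producers if contains_phrase(p, "jiajieshi producer")]
print(no_hits)   # []
print(two_hits)  # ['jiajieshi producer', 'special jiajieshi producer']
```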

6. highlight search

GET /ecommerce/_search
{
  "query":{
    "match":{
      "producer":"producer"
    }
  },
  "highlight":{
    "fields":{
      "producer":{}
    }
  }
}

{
  "took" : 2179,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 5,
      "relation" : "eq"
    },
    "max_score" : 0.08701137,
    "hits" : [
      {
        "_index" : "ecommerce",
        "_type" : "product",
        "_id" : "2",
        "_score" : 0.08701137,
        "_source" : {
          "name" : "jiajieshi yagao",
          "desc" : "youxiao fangzhu",
          "price" : 25,
          "producer" : "jiajieshi producer",
          "tags" : [
            "fangzhu"
          ]
        },
        "highlight" : {
          "producer" : [
            "jiajieshi <em>producer</em>"
          ]
        }
      },
...
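By default Elasticsearch wraps each matched term in the highlight fragment with <em> and </em> tags. The Python sketch below imitates that behavior for a whitespace-tokenized field; the function name highlight and its parameters are made up for illustration, and real highlighting works from the analyzed terms rather than a simple split.

```python
def highlight(field_text, query, pre_tag="<em>", post_tag="</em>"):
    """Wrap every word of the field that matches a query term in tags."""
    query_terms = set(query.lower().split())
    return " ".join(
        f"{pre_tag}{word}{post_tag}" if word.lower() in query_terms else word
        for word in field_text.split()
    )


fragment = highlight("jiajieshi producer", "producer")
print(fragment)  # jiajieshi <em>producer</em>
```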