
E-commerce product management (3): aggregation analysis with group by, avg, and sort

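This material comes from the Bilibili course 《Elasticsearch 顶尖高手系列课程核心知识篇》 by 中华石杉. I can't speak for anyone else, but I found it a bit hard to follow and hard to remember. Fortunately this is only an example, and detailed explanations come later.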

First analysis requirement

1. Count the number of products under each tag

GET /ecommerce/_search
{
  "aggs":{
    "group_by_tags":{
      "terms": {"field":"tags"}
    }
  }
}

{
  "error" : {
    "root_cause" : [
      {
        "type" : "illegal_argument_exception",
        "reason" : "Text fields are not optimised for operations that require per-document field data like aggregations and sorting, so these operations are disabled by default. Please use a keyword field instead. Alternatively, set fielddata=true on [tags] in order to load field data by uninverting the inverted index. Note that this can use significant memory."
      }
    ],
    "type" : "search_phase_execution_exception",
    "reason" : "all shards failed",
    "phase" : "query",
    "grouped" : true,
...

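The request fails because tags is a text field, and aggregations on text fields are disabled by default. As the error message suggests, set the fielddata property of the text field to true: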

POST ecommerce/_mapping
{
  "properties":{
    "tags":{
      "type":"text",
      "fielddata":true
    }
  }
}
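
The error message also points to another option: aggregate on a keyword field instead of enabling fielddata. Here is a minimal sketch in Python that only builds the request body. The field name tags.keyword is an assumption: it exists only if the mapping defines a keyword sub-field (dynamic mapping usually creates one for string values), and nothing above confirms that for this index. The helper terms_body is a hypothetical name used for illustration.

```python
import json

def terms_body(field, agg_name="all_tags"):
    """Build a terms-aggregation request body that returns no hits (size 0)."""
    return {"size": 0, "aggs": {agg_name: {"terms": {"field": field}}}}

# Assumption: the index has a keyword sub-field named "tags.keyword".
body = terms_body("tags.keyword")
print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the body of GET ecommerce/_search. A keyword field keeps its values in doc values on disk, so it avoids the memory cost the error message warns about.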

GET ecommerce/_search
{
  "size":0,
  "aggs":{
    "all_tags":{
      "terms":{ "field":"tags" }
    }
  }
}

{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 6,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "all_tags" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : "fangzhu",
          "doc_count" : 2
        },
        ...
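
To make the result easier to read, here is a rough Python sketch of what a terms aggregation computes: one bucket per distinct tag, with the number of documents in each. The sample documents are hypothetical (the real contents of the ecommerce index are not shown here), and terms_agg is just an illustrative helper, not an Elasticsearch API.

```python
from collections import Counter

# Hypothetical sample documents, not the actual data in the ecommerce index.
docs = [
    {"name": "gaolujie yagao", "price": 30, "tags": ["meibai", "fangzhu"]},
    {"name": "jiajieshi yagao", "price": 25, "tags": ["fangzhu"]},
    {"name": "zhonghua yagao", "price": 40, "tags": ["qingxin"]},
]

def terms_agg(docs, field):
    """Count documents per distinct value of `field`, largest bucket first."""
    counts = Counter()
    for doc in docs:
        # A document with several tags lands once in each tag's bucket.
        for value in set(doc.get(field, [])):
            counts[value] += 1
    # Sort by doc_count descending; ties are broken by key for determinism.
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [{"key": key, "doc_count": n} for key, n in ordered]

buckets = terms_agg(docs, "tags")
for bucket in buckets:
    print(bucket)
```

Note that the bucket counts can add up to more than the number of documents, because one product may carry several tags.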

Second aggregation analysis requirement

2. For products whose name contains yagao (toothpaste), count the number of products under each tag

GET ecommerce/_search
{
  "size":0,
  "query":{
    "match":{
      "name":"yagao"
    }
  },
  "aggs":{
    "all_tags":{
      "terms":{
        "field":"tags"
      }
    }
  }
}
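
The point of this request is that the query narrows the document set first, and the aggregation then runs only over the matching documents. A small Python sketch of that order of operations, using hypothetical sample data and a deliberately crude stand-in for the match query (whitespace tokens, case-insensitive):

```python
from collections import Counter

# Hypothetical sample documents; the last one is an invented non-toothpaste item.
docs = [
    {"name": "gaolujie yagao", "tags": ["meibai", "fangzhu"]},
    {"name": "jiajieshi yagao", "tags": ["fangzhu"]},
    {"name": "some shuibei", "tags": ["qingxin"]},
]

def match(doc, field, text):
    """Crude stand-in for a match query: true if any query token appears in the field."""
    tokens = set(doc[field].lower().split())
    return any(token in tokens for token in text.lower().split())

# Step 1: the query selects the documents.
hits = [doc for doc in docs if match(doc, "name", "yagao")]

# Step 2: the aggregation counts tags over the selected documents only.
tag_counts = Counter(tag for doc in hits for tag in set(doc["tags"]))
print(dict(tag_counts))
```

The tag of the non-matching product never shows up in the buckets, which is the behavior to expect from combining query and aggs in one request.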

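Third aggregation analysis requirement

3. Group first, then compute the average of each group: calculate the average price of the products under each tag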

GET ecommerce/_search
{
  "size":0,
  "aggs":{
    "group_by_tags":{
      "terms":{"field":"tags"},
      "aggs":{
        "avg_price":{
          "avg":{"field":"price"}
        }
      }
    }
  }
}

{
  "took" : 103,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 6,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  },
  ...
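
The nested aggs block means "for each tag bucket, run avg over the price of the documents in that bucket". A minimal Python sketch of the same two steps, on hypothetical sample data (avg_by_tag is an illustrative helper name):

```python
from collections import defaultdict

# Hypothetical sample data, not the actual documents in the ecommerce index.
docs = [
    {"price": 30, "tags": ["meibai", "fangzhu"]},
    {"price": 25, "tags": ["fangzhu"]},
    {"price": 40, "tags": ["qingxin"]},
]

def avg_by_tag(docs):
    """Group prices by tag, then average each group."""
    prices = defaultdict(list)
    for doc in docs:
        for tag in set(doc["tags"]):
            prices[tag].append(doc["price"])
    return {tag: sum(values) / len(values) for tag, values in prices.items()}

averages = avg_by_tag(docs)
print(averages)
```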

Fourth data analysis requirement

4. Calculate the average price of the products under each tag, sorted by average price in descending order

GET ecommerce/_search
{
  "size":0,
  "aggs":{
    "all_tags":{
      "terms":{
        "field":"tags",
        "order":{"avg_price":"desc"}
      },
      "aggs":{
        "avg_price":{
          "avg":{"field":"price"}
        }
      }
    }
  }
}

{
  "took" : 119,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 6,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  },
  ...
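
The order option refers to the sub-aggregation by name (avg_price), so the buckets come back sorted by that metric rather than by document count. A short Python sketch of the idea on hypothetical sample data, with buckets shaped roughly like the ones Elasticsearch returns:

```python
from collections import defaultdict

# Hypothetical sample data, not the actual documents in the ecommerce index.
docs = [
    {"price": 30, "tags": ["meibai", "fangzhu"]},
    {"price": 25, "tags": ["fangzhu"]},
    {"price": 40, "tags": ["qingxin"]},
]

# Group prices by tag.
prices_by_tag = defaultdict(list)
for doc in docs:
    for tag in set(doc["tags"]):
        prices_by_tag[tag].append(doc["price"])

# One bucket per tag, carrying the avg_price metric.
buckets = [
    {"key": tag, "doc_count": len(p), "avg_price": {"value": sum(p) / len(p)}}
    for tag, p in prices_by_tag.items()
]

# Equivalent of "order": {"avg_price": "desc"}.
buckets.sort(key=lambda b: b["avg_price"]["value"], reverse=True)
for bucket in buckets:
    print(bucket["key"], bucket["avg_price"]["value"])
```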

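Fifth data analysis requirement

5. Group by the specified price ranges, then group by tag within each range, and finally calculate the average price of each group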

GET ecommerce/_search
{
  "size":0,
  "aggs":{
    "group_by_price":{
      "range":{
        "field":"price",
        "ranges":[
          {"from":0, "to":20},
          {"from":20, "to":40},
          {"from":40, "to":60}
        ]
      },
      "aggs":{
        "group_by_tags":{
          "terms":{
            "field":"tags"
          },
          "aggs":{
            "average_price":{
              "avg":{
                "field":"price"
              }
            }
          }
        }
      }
    }
  }
}

{
  "took" : 62,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 6,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "group_by_price" : {
      "buckets" : [
        {
          "key" : "0.0-20.0",
          "from" : 0.0,
          "to" : 20.0,
          "doc_count" : 0,
          "group_by_tags" : {
            "doc_count_error_upper_bound" : 0,
            "sum_other_doc_count" : 0,
            "buckets" : [ ]
          }
        },
...
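
This request nests three levels: price range, then tag, then average. In a range aggregation, from is inclusive and to is exclusive, so a price of exactly 40 falls into the 40-60 bucket. Here is a Python sketch of the same nesting on hypothetical sample data (range_then_tags is an illustrative helper, and the inner result is simplified to a tag-to-average mapping):

```python
from collections import defaultdict

# Hypothetical sample data, not the actual documents in the ecommerce index.
docs = [
    {"price": 30, "tags": ["meibai", "fangzhu"]},
    {"price": 25, "tags": ["fangzhu"]},
    {"price": 40, "tags": ["qingxin"]},
]

def range_then_tags(docs, ranges):
    """Bucket by price range, then by tag, then average the price per tag."""
    result = []
    for low, high in ranges:
        # from is inclusive, to is exclusive.
        in_range = [d for d in docs if low <= d["price"] < high]
        by_tag = defaultdict(list)
        for d in in_range:
            for tag in set(d["tags"]):
                by_tag[tag].append(d["price"])
        result.append({
            "key": f"{float(low)}-{float(high)}",
            "doc_count": len(in_range),
            "group_by_tags": {t: sum(p) / len(p) for t, p in sorted(by_tag.items())},
        })
    return result

result = range_then_tags(docs, [(0, 20), (20, 40), (40, 60)])
for bucket in result:
    print(bucket)
```

An empty range still produces a bucket with doc_count 0 and no tag buckets, which matches the shape of the first bucket in the response above.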