Elasticsearch Near Real-Time Search

2022-03-10 Author: escray

Notes on Elasticsearch near real-time search (refresh). The material comes from the core-knowledge part of 中华石杉's Elasticsearch expert course series on Bilibili; the English passages are from Elasticsearch: The Definitive Guide [2.x].

Optimizing the write flow to achieve NRT (near real-time) search

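The problem with the existing flow: every write has to wait for fsync to flush the segment to disk before the segment can be opened for search. As a result, the time from writing a document to being able to search it can exceed 1 minute, which is not near real-time search. The main bottleneck is the disk I/O that fsync performs when it writes the data to disk, which is slow.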

The original flow

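This is where the flow gets improved: ES does not wait for fsync to flush the data from the OS cache to disk before opening the index segment for search. Instead, as soon as the index segment's data reaches the OS cache, the segment is opened and made available for search.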

By default, the buffer is refreshed into a new index segment every second, so a new index segment file is produced every second.

The write flow is improved as follows

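The process of writing data into the OS cache and opening it for search is called a refresh; by default it runs once every second. In other words, every second the data in the buffer is written to a new index segment file, which lands in the OS cache first. That is why ES is near real-time: by default, data becomes searchable about 1 second after it is written.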
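The refresh cycle described above can be sketched with a toy model. This is only an illustration, not Elasticsearch or Lucene code: the class `ToyShard` and its methods are made-up names, and the "segments" are plain Python lists rather than files in the OS cache.

```python
class ToyShard:
    """Toy model of the buffer -> segment refresh cycle (not Elasticsearch code)."""

    def __init__(self):
        self.buffer = []      # documents indexed but not yet searchable
        self.segments = []    # each refresh adds one new searchable segment

    def index(self, doc):
        # A newly written document only lands in the in-memory buffer.
        self.buffer.append(doc)

    def refresh(self):
        # Move the buffer into a new segment and open it for search.
        # No fsync is involved; in ES the new segment sits in the OS cache.
        if self.buffer:
            self.segments.append(list(self.buffer))
            self.buffer.clear()

    def search(self, term):
        # Only documents in opened segments are visible to search.
        return [doc for seg in self.segments for doc in seg if term in doc]


shard = ToyShard()
shard.index("near real-time search")
before = shard.search("search")   # still in the buffer: not visible yet
shard.refresh()                   # what ES does every second by default
after = shard.search("search")    # visible after the refresh
print(before, after)
```

The gap between `before` and `after` is the "near" in near real-time: a document is invisible until the next refresh turns the buffer into a segment.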

You can also trigger a refresh manually

POST /my_index/_refresh

A manual refresh is generally unnecessary; just let ES handle it on its own.

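For example, if our freshness requirement is fairly loose and it is fine for a document to become searchable only within a minute or so of being written to ES, we can adjust the refresh interval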

PUT /my_index
{
  "settings": {
    "refresh_interval": "30s"
  }
}

So here is the question: where does the commit operation happen? That is covered in the next installment.

Near Real-Time Search

Committing a new segment to disk requires an fsync to ensure that the segment is physically written to disk and that data will not be lost if there is a power failure. But an fsync is costly, ...

A Lucene index with new documents in the in-memory buffer

Documents in the in-memory indexing buffer are written to a new segment. The new segment is written to the filesystem cache first - which is cheap - and only later is it flushed to disk - which is expensive. But once a file is in the cache, it can be opened and read, just like any other file.

Lucene allows new segments to be written and opened - making the documents they contain visible to search - without performing a full commit.
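The point that a file can be opened and read before it has been fsynced can be tried out directly at the operating-system level. The sketch below is only an analogy using Python's standard library: the temporary file stands in for a segment, and the helper name `write_then_read` is made up for illustration. It says nothing about how Lucene itself manages segment files.

```python
import os
import tempfile


def write_then_read(payload: bytes) -> bytes:
    """Write a file, read it back before any fsync, then fsync it."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, payload)            # data now sits in the OS file cache
        with open(path, "rb") as f:      # readable already, no fsync yet
            seen = f.read()
        os.fsync(fd)                     # the expensive step: force it to disk
        return seen
    finally:
        os.close(fd)
        os.remove(path)


seen_before_fsync = write_then_read(b"segment bytes")
print(seen_before_fsync)
```

The read succeeds before `os.fsync` is called, which is the property a refresh relies on: visibility does not have to wait for durability.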

The buffer contents have been written to a segment, which is searchable, but is not yet committed

refresh API

In Elasticsearch, this lightweight process of writing and opening a new segment is called a refresh. By default, every shard is refreshed automatically once every second. This is why we say that Elasticsearch has near real-time search: document changes are not visible to search immediately, but will become visible within 1 second.

// Refresh all indices
POST /_refresh
// Refresh just the blogs index
POST /blogs/_refresh

While a refresh is much lighter than a commit, it still has a performance cost.

Not all use cases require a refresh every second... You can reduce the frequency of refreshes on a per-index basis by setting the refresh_interval:

PUT /my_logs
{
  "settings": {
    // Refresh the my_logs index every 30 seconds
    "refresh_interval": "30s"
  }
}

The refresh_interval can be updated dynamically on an existing index.

// Disable automatic refreshes
PUT /my_logs/_settings
{ "refresh_interval": -1 }

// Refresh automatically every second
PUT /my_logs/_settings
{ "refresh_interval": "1s" }
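For scripting the same settings change, here is a minimal sketch that builds the two requests above with Python's standard library. The base URL `http://localhost:9200` is an assumption about where a node is listening, and `refresh_interval_request` is a hypothetical helper name. The requests are only constructed here, not sent.

```python
import json
import urllib.request

# Assumption: a node is reachable at this address; adjust for your cluster.
BASE_URL = "http://localhost:9200"


def refresh_interval_request(index, interval):
    """Build (but do not send) a PUT /<index>/_settings request."""
    body = json.dumps({"refresh_interval": interval}).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/{index}/_settings",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


disable = refresh_interval_request("my_logs", -1)     # disable automatic refreshes
restore = refresh_interval_request("my_logs", "1s")   # refresh every second again
print(disable.get_method(), disable.full_url, disable.data)
# To actually send a request: urllib.request.urlopen(restore)
```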

The refresh_interval expects a duration such as 1s (1 second) or 2m (2 minutes). An absolute number like 1 means 1 millisecond - a sure way to bring your cluster to its knees.
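The unit pitfall is easy to see when the values are converted to milliseconds. The helper below, `interval_to_ms`, is hypothetical and written only for illustration. It is not Elasticsearch's parser, it handles just a few units, and it treats a bare number as milliseconds as the passage above describes.

```python
import re

# Hypothetical helper for illustration only; this is not Elasticsearch's parser.
UNIT_MS = {"ms": 1, "s": 1000, "m": 60_000}


def interval_to_ms(value):
    """Return the interval in milliseconds, or None when refresh is disabled."""
    text = str(value).strip()
    if text == "-1":
        return None                      # -1 disables automatic refresh
    match = re.fullmatch(r"(\d+)(ms|s|m)?", text)
    if match is None:
        raise ValueError(f"unsupported interval: {value!r}")
    number, unit = match.groups()
    # A bare number carries no unit and is read as milliseconds.
    return int(number) * UNIT_MS[unit or "ms"]


for v in ("1s", "2m", 1, -1):
    print(v, "->", interval_to_ms(v))
```

Writing `1` instead of `"1s"` shrinks the interval by a factor of 1000, which is why the unit should always be spelled out.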

I think the most valuable part here is the tuning approach: find the bottleneck (fsync), then figure out how to optimize around it.