ElasticSearch does not recognize updates / disabling document versioning

Time: 2019-01-21 09:37:06

Tags: elasticsearch

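I am using ElasticSearch 6.5.4. Currently I write data straight into several indices once a day, using nothing but POST Ajax calls, and the ID is always the same. Elastic then reports that the document was created if the ID did not exist, or updated if the ID already existed.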

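Now I am running into a versioning problem, because the values do not seem to be updated. Looking at a document's details through the ElasticHead extension, I can see that the document is stored as, for example, "version 9", but data fields such as the index timestamp still hold the values from "version 1".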

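I have already checked, and not all versions are actually stored; only the latest one is. Elastic just keeps a version counter. So I am not sure why, for example, the timestamp does not get updated.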

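What I basically want is to disable versioning, or to somehow tell Elastic with every POST call that the document I am indexing is the current one and that only this document should be used for display. None of the similar questions I found matches my problem exactly, and none of their solutions fixes it.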

Could this problem occur because I am using simple POST Ajax calls? Would it be solved by using Logstash as an ingest pipeline (which I plan to do as the project progresses)?

Example:

Data sent to Elastic
ajax.call - method.post - Result: 200
https://elasticURL.com/article/123456
{
     "customer": "Customer",
     "source": "Source",
     "categories": "/Category1/Category2/Category3/",
     "title": "My Title",
     "articletype": "MyType",
     "rating": 100,
     "ratingnormalized": 5,
     "views": 1234,
     "author": "Author",
     "timeindexed": "21/1/2019 10:00",
     "timeindexedDate": "21/1/2019",
     "timecreated": "13/10/2018 11:22",
     "timecreatedDate": "13/10/2018 ",
     "timeupdated": "13/10/2018 11:22",
     "timeupdatedDate": "13/10/2018 ",
     "url": "https://www.google.com",
     "category1": "Category: 1",
     "category2": "Category: 2",
     "category3": "Category: 3",
     "text": "MyText ",
     "html": "<html></html>"
}
Data stored in Elastic
{
"_index": "index_name",
"_type": "article",
"_id": "123456",
"_version": 3,
"_score": 1,
"_source": {
     "customer": "Customer",
     "source": "Source",
     "categories": "/Category1/Category2/Category3/",
     "title": "My Title",
     "articletype": "MyType",
     "rating": 100,
     "ratingnormalized": 5,
     "views": 1234,
     "author": "Author",
     "timeindexed": "04/1/2019 01:05",
     "timeindexedDate": "04/1/2019",
     "timecreated": "13/10/2018 11:22",
     "timecreatedDate": "13/10/2018 ",
     "timeupdated": "13/10/2018 11:22",
     "timeupdatedDate": "13/10/2018 ",
     "url": "https://www.google.com",
     "category1": "Category: 1",
     "category2": "Category: 2",
     "category3": "Category: 3",
     "text": "MyText ",
     "html": "<html></html>"
  }
}
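
To make the discrepancy between the two listings above explicit, here is a minimal Python sketch that compares a request body with the stored `_source` and lists the fields that differ. The helper name `diff_fields` and the variable names are hypothetical, and the dictionaries contain only a subset of the fields shown above (the remaining fields are identical in both listings). It does not talk to Elastic; it only compares two dictionaries.

```python
from typing import Any, Dict, Tuple


def diff_fields(sent: Dict[str, Any], stored: Dict[str, Any]) -> Dict[str, Tuple[Any, Any]]:
    """Map each differing field to a (sent value, stored value) pair."""
    differing = {}
    for key in sorted(set(sent) | set(stored)):
        if sent.get(key) != stored.get(key):
            differing[key] = (sent.get(key), stored.get(key))
    return differing


# Subset of the request body from the example (hypothetical variable name).
sent_body = {
    "views": 1234,
    "timeindexed": "21/1/2019 10:00",
    "timeindexedDate": "21/1/2019",
    "timeupdated": "13/10/2018 11:22",
}

# Same subset taken from the stored "_source" in the example.
stored_source = {
    "views": 1234,
    "timeindexed": "04/1/2019 01:05",
    "timeindexedDate": "04/1/2019",
    "timeupdated": "13/10/2018 11:22",
}

stale = diff_fields(sent_body, stored_source)
for field, (sent_value, stored_value) in stale.items():
    print(f"{field}: sent {sent_value!r}, stored {stored_value!r}")
```

For the example data, only `timeindexed` and `timeindexedDate` are reported, which matches the symptom described: the version number increases while the index timestamp keeps its old value.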

Index creation

PUT - https://elasticURL.com/index_name
{
    "settings" : {
        "index" : {
            "number_of_shards" : 5, 
            "number_of_replicas" : 0 
        }
    }
}

Index mapping

PUT - https://elasticURL.com/index_name/_mappings
{
    "properties" : {
        "customer" : { "type" : "text" },
        "source" : { "type" : "text" },
        "categories" : { "type" : "text" },
        "articletype" : { "type" : "text" },    
        "title" : { "type" : "text" },
        "rating" : { "type" : "integer" },
        "ratingnormalized" : { "type" : "integer" },
        "views" : { "type" : "integer" },
        "author" : { "type" : "text" },
        "timeindexed" : { "type" : "date", "format" : "dd/MM/yyyy HH:mm"},
        "timeindexeddate" : { "type" : "date", "format" : "dd/MM/yyyy"},
        "timecreated" : { "type" : "date", "format" : "dd/MM/yyyy HH:mm"},
        "timecreateddate" : { "type" : "date", "format" : "dd/MM/yyyy"},
        "timeupdated" : { "type" : "date", "format" : "dd/MM/yyyy HH:mm"},
        "timeupdateddate" : { "type" : "date", "format" : "dd/MM/yyyy"},
        "url" : { "type" : "text" },
        "category1" : { "type" : "text" },
        "category2" : { "type" : "text" },
        "category3" : { "type" : "text" },
        "text" : { "type" : "text" },
        "html" : { "type" : "text" }
    }
}
+ keyword mapping for text fields, eg
        "text": {
            "type": "text",
            "fields": {
                "keyword": {
                    "ignore_above": 256,
                    "type": "keyword"
                }
            }
        },
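
One detail that can be checked from the listings alone: field names in Elastic are case-sensitive, so a document field is only covered by an explicit mapping if the names match exactly. The following Python sketch (the helper name `case_mismatches` is hypothetical) compares the field names of the posted document with the property names of the mapping above and reports those that differ only in letter case. Whether this has anything to do with the stale `timeindexed` value is an open assumption; the sketch merely surfaces the naming difference.

```python
from typing import Dict, Iterable


def case_mismatches(doc_fields: Iterable[str], mapped_fields: Iterable[str]) -> Dict[str, str]:
    """Map each document field that has no exact mapping entry to the
    mapping entry that differs from it only by letter case."""
    mapped = set(mapped_fields)
    by_lower = {name.lower(): name for name in mapped}
    return {
        field: by_lower[field.lower()]
        for field in doc_fields
        if field not in mapped and field.lower() in by_lower
    }


# Field names as they appear in the posted document.
doc_fields = [
    "customer", "source", "categories", "title", "articletype", "rating",
    "ratingnormalized", "views", "author", "timeindexed", "timeindexedDate",
    "timecreated", "timecreatedDate", "timeupdated", "timeupdatedDate",
    "url", "category1", "category2", "category3", "text", "html",
]

# Property names as they appear in the index mapping.
mapped_fields = [
    "customer", "source", "categories", "articletype", "title", "rating",
    "ratingnormalized", "views", "author", "timeindexed", "timeindexeddate",
    "timecreated", "timecreateddate", "timeupdated", "timeupdateddate",
    "url", "category1", "category2", "category3", "text", "html",
]

mismatches = case_mismatches(doc_fields, mapped_fields)
for doc_name, mapped_name in mismatches.items():
    print(f"document sends {doc_name!r}, mapping defines {mapped_name!r}")
```

With the names listed above, the three `...Date` fields of the document (`timeindexedDate`, `timecreatedDate`, `timeupdatedDate`) have no exact counterpart in the mapping, which defines them in all lowercase.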

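To sum up quickly: the timeindexed field, for example, is not updated when a new POST is sent to update the data. Only the version increases. The Elastic index shows docs: 4.498 (5.320).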

0 answers:

No answers