_update_by_query does not update all documents in Elasticsearch

Time: 2019-11-25 12:33:05

Tags: elasticsearch elasticsearch-painless

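I have more than 30 million documents in Elasticsearch (version 6.3.3), and I am trying to add a new field to all existing documents and set its value to 0.

For example: I want to add a start field, which did not exist before, inside the Twitter object, and set its initial value to 0 in all 30 million documents.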

In my case, only about 4 million documents were updated. If I check the submitted task with the Task API at http://localhost:9200/_tasks/{taskId}, the result looks like this:

{
  "completed": false,
  "task": {
    "node": "Jsecb8kBSdKLC47Q28O6Pg",
    "id": 5968304,
    "type": "transport",
    "action": "indices:data/write/update/byquery",
    "status": {
      "total": 34002005,
      "updated": 3618000,
      "created": 0,
      "deleted": 0,
      "batches": 3619,
      "version_conflicts": 0,
      "noops": 0,
      "retries": {
        "bulk": 0,
        "search": 0
      },
      "throttled_millis": 0,
      "requests_per_second": -1.0,
      "throttled_until_millis": 0
    },
    "description": "update-by-query [Twitter][tweet] updated with Script{type=inline, lang='painless', idOrCode='ctx._source.Twitter.start = 0;', options={}, params={}}",
    "start_time_in_millis": 1574677050104,
    "running_time_in_nanos": 466805438290,
    "cancellable": true,
    "headers": {}
  }
}
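One way to read the status above: "completed" is false, so the task had not finished when it was polled. The Elasticsearch documentation suggests estimating progress by adding the updated, created, and deleted counters and comparing the sum with total. The following is a minimal sketch of that arithmetic in Python; the function name summarize_task and the trimmed SAMPLE dictionary are illustrative assumptions, with the numbers copied from the response above.

```python
def summarize_task(response: dict) -> dict:
    """Estimate progress of an update-by-query task from a Task API response."""
    task = response["task"]
    status = task["status"]
    # Operations finished so far; the task ends when this sum reaches "total".
    done = status["updated"] + status["created"] + status["deleted"]
    total = status["total"]
    elapsed_s = task["running_time_in_nanos"] / 1e9
    # Average throughput so far, used for a rough linear estimate only.
    rate = done / elapsed_s if elapsed_s > 0 else 0.0
    remaining_s = (total - done) / rate if rate > 0 else None
    return {
        "completed": response["completed"],
        "done": done,
        "total": total,
        "fraction": done / total if total else 0.0,
        "elapsed_s": elapsed_s,
        "docs_per_s": rate,
        "remaining_s": remaining_s,
    }


# Trimmed copy of the Task API response shown in the question.
SAMPLE = {
    "completed": False,
    "task": {
        "status": {
            "total": 34002005,
            "updated": 3618000,
            "created": 0,
            "deleted": 0,
            "version_conflicts": 0,
            "noops": 0,
        },
        "running_time_in_nanos": 466805438290,
    },
}

summary = summarize_task(SAMPLE)
print(f"completed:  {summary['completed']}")
print(f"progress:   {summary['done']} / {summary['total']} "
      f"({summary['fraction']:.1%})")
print(f"elapsed:    {summary['elapsed_s']:.0f} s")
print(f"throughput: {summary['docs_per_s']:.0f} docs/s")
print(f"remaining:  {summary['remaining_s'] / 60:.0f} min (linear estimate)")
```

With these numbers the task is roughly a tenth of the way through after a little under eight minutes, so the 3.6 million figure may simply be a snapshot of a task that is still running rather than a final result. Polling the same task again later would show whether updated keeps growing.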

The query I am running against ES looks like this:

curl -XPOST "http://localhost:9200/_update_by_query?wait_for_completion=false&conflicts=proceed" -H 'Content-Type: application/json' -d'
{
  "script": {
    "source": "ctx._source.Twitter.start = 0;"
  },
  "query": {
    "exists": {
      "field": "Twitter"
    }
  }
}'
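If the task does stop early (for example after a node restart or a cancellation), the request above can be made safe to re-run by skipping documents that already have Twitter.start. Below is a hedged sketch that builds such a request with the Python standard library. The helper names build_update_body and build_request are hypothetical, the must_not clause assumes Twitter.start is indexed once it is written (for example through dynamic mapping), and the instanceof guard is an extra precaution that is not part of the original script.

```python
import json
import urllib.parse
import urllib.request


def build_update_body(initial_value: int = 0) -> dict:
    """Request body that only touches documents still missing Twitter.start."""
    return {
        "script": {
            "lang": "painless",
            # Guard against documents where Twitter is not an object;
            # those are skipped with a noop instead of failing the batch.
            "source": (
                "if (ctx._source.Twitter instanceof Map) "
                "{ ctx._source.Twitter.start = params.initial } "
                "else { ctx.op = 'noop' }"
            ),
            "params": {"initial": initial_value},
        },
        "query": {
            "bool": {
                "filter": [{"exists": {"field": "Twitter"}}],
                # Assumption: Twitter.start becomes searchable after the update,
                # so a second run ignores documents that were already processed.
                "must_not": [{"exists": {"field": "Twitter.start"}}],
            }
        },
    }


def build_request(base_url: str) -> urllib.request.Request:
    """Prepare (but do not send) the _update_by_query call from the question."""
    params = urllib.parse.urlencode(
        {"wait_for_completion": "false", "conflicts": "proceed"}
    )
    return urllib.request.Request(
        url=f"{base_url}/_update_by_query?{params}",
        data=json.dumps(build_update_body()).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Print the body for inspection; nothing is sent to the cluster here.
print(json.dumps(build_update_body(), indent=2))
```

To actually submit it, the prepared request could be passed to urllib.request.urlopen(build_request("http://localhost:9200")); the response should contain a task id that can be polled through the Task API as before.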

Any suggestions would be great, thanks.

0 answers:

No answers yet.