Question

I've looked through the ES documentation and read the related questions, but so far none of them have worked for me.

Basically, I have a JSON file containing multiple documents, written in this format:

[ { 
    "account": "Sam420", 
    "language": null, 
    "watchers": 0, 
    "commits": 14, 
    "contributors": 2, 
    "stars": 0, 
    "rank": 16, 
}
{ 
    "account": "Kelly", 
    "language": null, 
    "watchers": 0, 
    "commits": 14, 
    "contributors": 2, 
    "stars": 0, 
    "rank": 16, 
} ]

I've tried POSTing a request to my local ES setup through the bulk API, using the following body format:

 { "index": {} }
 { 
    "account": "Kelly", 
    "language": null, 
    "watchers": 0, 
    "commits": 14, 
    "contributors": 2, 
    "stars": 0, 
    "rank": 16, 
} 
{ "index": {} }
{ 
    "account": "Kelly", 
    "language": null, 
    "watchers": 0, 
    "commits": 14, 
    "contributors": 2, 
    "stars": 0, 
    "rank": 16, 
}

However, I get a parser error. When I rearrange the data so that each object sits on a single line, as below, it does work:

{ "index": { "_index": "folder" } }
{ "account": "Sam420", "language": null, ... }
{ "index": { "_index": "Canigan"} }
{ "account": "Kelly", "language": null, ... }

This is the parser error:

{
    "error": {
    "root_cause": [
       {
          "type": "json_parse_exception",
          "reason": "Unexpected character (':' (code 58)): expected a      
                    valid value (number, String, array, object, 'true',     
                    'false' or 'null')\n at [Source: [B@6bd0ddf7; line: 
                    1, column: 10]"
        }],
           "type": "json_parse_exception",
           "reason": "Unexpected character (':' (code 58)): expected a 
                     valid value (number, String, array, object 'true', 
                    'false' or 'null')\n at [Source: [B@6bd0ddf7; line: 
                    1, column: 10]"
       },
       "status": 500
}

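However, I'm pulling repo data for 100+ documents from the GitHub API, and every value comes laid out vertically. Without resorting to a script to reformat it, what can I do to bulk index multiple documents in the JSON format I've already been given? If that isn't possible, is there some method other than bulk indexing that I can use to index multiple documents at once?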

Answer 1

The documentation for version 5.5 is quite clear: https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-bulk.html

"Because this format uses literal \n's as delimiters, please be sure that the JSON actions and sources are not pretty printed."

You have to lay each object out on a single line.

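That said, you don't need a complicated script to reformat the objects. You can use something like Notepad++ to replace ",\n" (comma then newline) with ", " (comma then space). Then interleave the index/metadata lines as you did.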

You may also need to watch out for the trailing commas at the end of the property lists.
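
If a short script turns out to be acceptable after all, the same reshaping can be sketched in a few lines of Python. This is only a sketch under assumptions: to_bulk_body, index_name and doc_type are names made up for this example, the trailing-comma cleanup is a naive regex, and it assumes the objects inside the array are separated by commas.

```python
import json
import re


def to_bulk_body(pretty_json_text, index_name=None, doc_type=None):
    """Turn a pretty-printed JSON array into a newline-delimited bulk body."""
    # Naive cleanup: drop a comma that directly precedes } or ].
    # Caveat: this would also alter such a sequence inside a string value.
    cleaned = re.sub(r",\s*([}\]])", r"\1", pretty_json_text)
    docs = json.loads(cleaned)

    # Metadata for the action line; left empty when the index (and type)
    # are already part of the request URL.
    meta = {}
    if index_name is not None:
        meta["_index"] = index_name
    if doc_type is not None:
        meta["_type"] = doc_type

    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": meta}))  # action line
        lines.append(json.dumps(doc))              # source, kept on one line
    # The last line of a bulk body must also end with a newline.
    return "\n".join(lines) + "\n"


# Hypothetical sample, shortened from the data in the question.
sample = """[
  {
    "account": "Sam420",
    "rank": 16,
  },
  {
    "account": "Kelly",
    "rank": 16,
  }
]"""
print(to_bulk_body(sample, index_name="my_index"), end="")
```

json.dumps writes each object without line breaks by default, which is what the bulk format needs. The returned string can then be sent as the request body, for example with curl's --data-binary option so the newlines are preserved.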

Answer 2

I think it didn't work because you haven't supplied the index and index type.

{ "index": { "_index": "my_index", "_type": "my_index_type" } }
{ "account": "Sam420", "language": null, ... }
{ "index": { "_index": "my_index", "_type": "my_index_type" } }
{ "account": "Kelly", "language": null, ... }

In Elasticsearch, how do I bulk index a JSON file containing multiple documents in one go?
