我有许多TSV文件作为Azure blob,后面有四个以制表符分隔的列:
metadata_path, document_url, access_date, content_type
我想按照此处所述对它们编制索引:https://docs.microsoft.com/en-us/azure/search/search-howto-index-csv-blobs
我创建索引器的请求包含以下内容:
{
"name" : "webdata",
"dataSourceName" : "webdata",
"targetIndexName" : "webdata",
"schedule" : { "interval" : "PT1H", "startTime" : "2017-01-09T11:00:00Z" },
"parameters" : { "configuration" : { "parsingMode" : "delimitedText", "delimitedTextHeaders" : "metadata_path,document_url,access_date,content_type" , "firstLineContainsHeaders" : true, "delimitedTextDelimiter" : "\t" } },
"fieldMappings" : [ { "sourceFieldName" : "document_url", "targetFieldName" : "id", "mappingFunction" : { "name" : "base64Encode", "parameters" : "useHttpServerUtilityUrlTokenEncode" : false } } }, { "sourceFieldName" : "document_url", "targetFieldName" : "url" }, { "sourceFieldName" : "content_type", "targetFieldName" : "content_type" } ]
}
我收到错误:
{
"error": {
"code": "",
"message": "Data source does not contain column 'document_url', which is required because it maps to the document key field 'id' in the index 'webdata'. Ensure that the 'document_url' column is present in the data source, or add a field mapping that maps one of the existing column names to 'id'."
}
}
我做错了什么?
答案 0 :(得分:0)
我做错了什么?
在您的情况下,您提供的json格式无效。以下是创建索引器的请求。我们可以参考此document
的详细信息{
"name" : "Required for POST, optional for PUT. The name of the indexer",
"description" : "Optional. Anything you want, or null",
"dataSourceName" : "Required. The name of an existing data source",
"targetIndexName" : "Required. The name of an existing index",
"schedule" : { Optional. See Indexing Schedule below. },
"parameters" : { Optional. See Indexing Parameters below. },
"fieldMappings" : { Optional. See Field Mappings below. },
"disabled" : Optional boolean value indicating whether the indexer is disabled. False by default.
}
如果我们想使用Rest API创建索引器。我们需要3个步骤才能做到这一点。我也为它做了一个演示。 如果Azure搜索SDK可以接受,您还可以引用另一个SO thread。
1.创建数据源。
POST https://[service name].search.windows.net/datasources?api-version=2015-02-28-Preview
Content-Type: application/json
api-key: [admin key]
{
"name" : "my-blob-datasource",
"type" : "azureblob",
"credentials" : { "connectionString" : "DefaultEndpointsProtocol=https;AccountName=<account name>;AccountKey=<account key>;" },
"container" : { "name" : "my-container", "query" : "<optional, my-folder>" }
}
{
"name" : "my-target-index",
"fields": [
{ "name": "metadata_path","type": "Edm.String", "key": true, "searchable": true },
{ "name": "document_url", "type": "Edm.String", "searchable": true, "filterable": false, "sortable": false, "facetable": false },
{ "name": "access_date", "type": "Edm.String", "searchable": true, "filterable": false, "sortable": false, "facetable": false },
{ "name": "content_type", "type": "Edm.String", "searchable": true, "filterable": false, "sortable": false, "facetable": false }
]
}
答案 1 :(得分:0)
以下是有效的请求正文:
{
"name" : "webdata",
"dataSourceName" : "webdata",
"targetIndexName" : "webdata",
"schedule" :
{
"interval" : "PT1H",
"startTime" : "2017-01-09T11:00:00Z"
},
"parameters" :
{
"configuration" :
{
"parsingMode" : "delimitedText",
"delimitedTextHeaders" : "document_url,content_type,link_text" ,
"firstLineContainsHeaders" : true,
"delimitedTextDelimiter" : "\t",
"indexedFileNameExtensions" : ".tsv"
}
},
"fieldMappings" :
[
{
"sourceFieldName" : "document_url",
"targetFieldName" : "id",
"mappingFunction" : {
"name" : "base64Encode",
"parameters" : {
"useHttpServerUtilityUrlTokenEncode" : false
}
}
},
{
"sourceFieldName" : "document_url",
"targetFieldName" : "document_url"
},
{
"sourceFieldName" : "content_type",
"targetFieldName" : "content_type"
},
{
"sourceFieldName" : "link_text",
"targetFieldName" : "link_text"
}
]
}