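I am using the LUIS Authoring API to send training data to a LUIS app. I am seeing an index error that occurs inconsistently for two nearly identical training utterances:
POST data: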
[
  {
    "text": "what are the main drivers for Aon",
    "intentName": "QI.StockOverview",
    "entityLabels": [
      {
        "entityName": "company",
        "startCharIndex": 30,
        "endCharIndex": 32
      }
    ]
  },
  {
    "text": "what are the main drivers for Concho Resources",
    "intentName": "QI.StockOverview",
    "entityLabels": [
      {
        "entityName": "company",
        "startCharIndex": 30,
        "endCharIndex": 45
      }
    ]
  }
]
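I deliberately reduced endCharIndex by one, on the assumption that it means "up to but not including the last character", as is common in some languages. However, I still get a response like this: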
[
  {
    "value": {
      "ExampleId": 66579136,
      "UtteranceText": "what are the main drivers for aon"
    },
    "hasError": false
  },
  {
    "value": null,
    "hasError": true,
    "error": {
      "code": "FAILED",
      "message": "Example: \"what are the main drivers for Concho Resources\". Error: Index and length must refer to a location within the string.\r\nParameter name: length"
    }
  }
]
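The message looks like the standard C# error raised when slicing a string at an index that does not exist. What puzzles me is that one of the utterances works and the other does not.
The documentation here says "both indices are zero-based", and I believe I have got that right.
Any tips? Thanks.
Edit:
I am using roughly the same Python script as this demo. I made only one change: POSTing in batches of 90, because I have 900 training utterances and the limit appears to be 100 per request. The relevant method is: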
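As a sanity check on the indices, here is a minimal sketch that recomputes them locally. It assumes endCharIndex is inclusive (the documentation quoted above only says the indices are zero-based), and `label_span` is a hypothetical helper, not part of the LUIS API or the demo script:

```python
def label_span(text, entity_text):
    """Return (start, end) for entity_text in text, with an inclusive end.

    Hypothetical helper; returns None if entity_text is not found.
    """
    start = text.find(entity_text)
    if start == -1:
        return None
    return start, start + len(entity_text) - 1


utterances = [
    ("what are the main drivers for Aon", "Aon"),
    ("what are the main drivers for Concho Resources", "Concho Resources"),
]

for text, entity in utterances:
    start, end = label_span(text, entity)
    # Both indices must fall inside the string.
    assert 0 <= start <= end < len(text)
    print(text[start:end + 1], start, end, len(text))
```

Under that assumption the spans come out as 30 to 32 and 30 to 45, which matches the values in the POST data, and both lie within the string on the Python side.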
def add_utterances(self, filename=UTTERANCE_FILE):
    with open(filename, encoding=self.UTF8) as utterance:
        data = json.load(utterance)
    chunk_size = 90
    print(f'Adding {len(data)} utterances in chunks of {chunk_size}')
    for i, chunk in enumerate(chunks(data, chunk_size)):
        print('chunk', i)
        self.call(self.EXAMPLES, self.POST, json.dumps(chunk))
        print(self.result)
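The `chunks` helper is not shown in the method above. A minimal sketch of what it is assumed to do (split a list into consecutive slices of at most `size` items) might look like this; the name and signature are taken from the call site, the body is an assumption:

```python
def chunks(items, size):
    """Yield consecutive slices of items, each at most size long."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


# 900 utterances in chunks of 90 gives 10 requests, each under the limit of 100.
batches = list(chunks(list(range(900)), 90))
print(len(batches), len(batches[0]), len(batches[-1]))
```

Plain slicing does not touch the dictionaries inside each chunk, so the character indices in every utterance should be unchanged by the batching.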
The HTTP request is
POST westus.api.cognitive.microsoft.com/luis/api/v2.0/apps/{app_id}/versions/{app_version}/examples
with the JSON list as the request body.
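For anyone who wants to reproduce the request outside the demo script, here is a rough sketch that only builds the request object and does not send it. The host and path are the ones given above; the `https` scheme, the `Ocp-Apim-Subscription-Key` header, and the placeholder IDs and key are assumptions, and `build_examples_request` is a hypothetical helper:

```python
import json
import urllib.request

# Host and path as in the request above; the https scheme is assumed.
BASE = "https://westus.api.cognitive.microsoft.com/luis/api/v2.0"


def build_examples_request(app_id, app_version, examples, key):
    """Build (but do not send) the batch POST of labeled examples."""
    url = f"{BASE}/apps/{app_id}/versions/{app_version}/examples"
    body = json.dumps(examples).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        # Assumed authoring-key header; replace with whatever your script uses.
        "Ocp-Apim-Subscription-Key": key,
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


# Placeholder values for illustration only.
example = {"text": "hi", "intentName": "None", "entityLabels": []}
req = build_examples_request("APP_ID", "0.1", [example], "KEY")
print(req.get_method(), req.full_url)
```

Printing `req.data` before sending is a quick way to confirm that the body really is the JSON list with the expected indices.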