Rasa NLU parse request returns wrong intent results

Time: 2017-09-28 09:56:17

Tags: rasa-nlu

Rasa NLU version (e.g. 0.7.3): 0.10.0a6

Used backend / pipeline (mitie, spacy_sklearn, ...): ["nlp_spacy", "tokenizer_spacy", "intent_featurizer_spacy","ner_crf", "ner_synonyms", "intent_classifier_sklearn","ner_spacy"]

Operating system (windows, osx, ...): Windows Server 2012 R2

Issue: I have installed Rasa NLU version 0.10.0a6. My config_spacy file looks like this:

{

"project":"Project",
"pipeline" : ["nlp_spacy", "tokenizer_spacy", "intent_featurizer_spacy","ner_crf", "ner_synonyms", "intent_classifier_sklearn","ner_spacy"],
"path" : "./projects",

"cors_origins":["*"],
"data" : "./data/examples/rasa/People.json"
}

My data file looks like this:

{
  "rasa_nlu_data": {
    "regex_features": [
      {
        "name": "zipcode",
        "pattern": "[0-9]{5}"
      }
    ],
    "entity_synonyms": [
      {
        "value": "chinese",
        "synonyms": ["Chinese", "Chines", "chines"]
      },
      {
        "value": "vegetarian",
        "synonyms": ["veggie", "vegg"]
      }
    ],
    "common_examples": [
      {
        "text": "hey", 
        "intent": "greet", 
        "entities": []
      }, 
      {
        "text": "howdy", 
        "intent": "greet", 
        "entities": []
      }, 
      {
        "text": "hey there",
        "intent": "greet", 
        "entities": []
      }, 
      {
        "text": "hello", 
        "intent": "greet", 
        "entities": []
      }, 
      {
        "text": "hi", 
        "intent": "greet", 
        "entities": []
      },
      {
        "text": "good morning",
        "intent": "greet",
        "entities": []
      },
      {
        "text": "good evening",
        "intent": "greet",
        "entities": []
      },
      {
        "text": "dear sir",
        "intent": "greet",
        "entities": []
      },
      {
        "text": "yes", 
        "intent": "affirm", 
        "entities": []
      }, 
      {
        "text": "yep", 
        "intent": "affirm", 
        "entities": []
      }, 
      {
        "text": "yeah", 
        "intent": "affirm", 
        "entities": []
      },
      {
        "text": "indeed",
        "intent": "affirm",
        "entities": []
      },
      {
        "text": "that's right",
        "intent": "affirm",
        "entities": []
      },
      {
        "text": "ok",
        "intent": "affirm",
        "entities": []
      },
      {
        "text": "great",
        "intent": "affirm",
        "entities": []
      },
      {
        "text": "right, thank you",
        "intent": "affirm",
        "entities": []
      },
      {
        "text": "correct",
        "intent": "affirm",
        "entities": []
      },
      {
        "text": "great choice",
        "intent": "affirm",
        "entities": []
      },
      {
        "text": "sounds really good",
        "intent": "affirm",
        "entities": []
      },
      {
        "text": "i'm looking for a place to eat",
        "intent": "restaurant_search",
        "entities": []
      },
      {
        "text": "I want to grab lunch",
        "intent": "restaurant_search",
        "entities": []
      },
      {
        "text": "I am searching for a dinner spot",
        "intent": "restaurant_search",
        "entities": []
      },
      {
        "text": "i'm looking for a place in the north of town",
        "intent": "restaurant_search",
        "entities": [
          {
            "start": 31,
            "end": 36,
            "value": "north",
            "entity": "location"
          }
        ]
      },
      {
        "text": "show me chinese restaurants",
        "intent": "restaurant_search",
        "entities": [
          {
            "start": 8,
            "end": 15,
            "value": "chinese",
            "entity": "cuisine"
          }
        ]
      },
      {
        "text": "show me chines restaurants",
        "intent": "restaurant_search",
        "entities": [
          {
            "start": 8,
            "end": 14,
            "value": "chinese",
            "entity": "cuisine"
          }
        ]
      },
      {
        "text": "show me a mexican place in the centre", 
        "intent": "restaurant_search", 
        "entities": [
          {
            "start": 31, 
            "end": 37, 
            "value": "centre", 
            "entity": "location"
          }, 
          {
            "start": 10, 
            "end": 17, 
            "value": "mexican", 
            "entity": "cuisine"
          }
        ]
      },
      {
        "text": "i am looking for an indian spot called olaolaolaolaolaola",
        "intent": "restaurant_search",
        "entities": [
          {
            "start": 20,
            "end": 26,
            "value": "indian",
            "entity": "cuisine"
          }
        ]
      },
      {
        "text": "search for restaurants",
        "intent": "restaurant_search",
        "entities": []
      },
      {
        "text": "anywhere in the west",
        "intent": "restaurant_search",
        "entities": [
          {
            "start": 16,
            "end": 20,
            "value": "west",
            "entity": "location"
          }
        ]
      },
      {
        "text": "anywhere near 18328",
        "intent": "restaurant_search",
        "entities": [
          {
            "start": 14,
            "end": 19,
            "value": "18328",
            "entity": "location"
          }
        ]
      },
      {
        "text": "I am looking for asian fusion food",
        "intent": "restaurant_search",
        "entities": [
          {
            "start": 17,
            "end": 29,
            "value": "asian fusion",
            "entity": "cuisine"
          }
        ]
      },
      {
        "text": "I am looking a restaurant in 29432",
        "intent": "restaurant_search",
        "entities": [
          {
            "start": 29,
            "end": 34,
            "value": "29432",
            "entity": "location"
          }
        ]
      },
      {
        "text": "I am looking for mexican indian fusion",
        "intent": "restaurant_search",
        "entities": [
          {
            "start": 17,
            "end": 38,
            "value": "mexican indian fusion",
            "entity": "cuisine"
          }
        ]
      },
      {
        "text": "central indian restaurant",
        "intent": "restaurant_search",
        "entities": [
          {
            "start": 0,
            "end": 7,
            "value": "central",
            "entity": "location"
          },
          {
            "start": 8,
            "end": 14,
            "value": "indian",
            "entity": "cuisine"
          }
        ]
      },
      {
        "text": "bye", 
        "intent": "goodbye", 
        "entities": []
      }, 
      {
        "text": "goodbye", 
        "intent": "goodbye", 
        "entities": []
      }, 
      {
        "text": "good bye", 
        "intent": "goodbye", 
        "entities": []
      }, 
      {
        "text": "stop", 
        "intent": "goodbye", 
        "entities": []
      }, 
      {
        "text": "end", 
        "intent": "goodbye", 
        "entities": []
      },
      {
        "text": "farewell",
        "intent": "goodbye",
        "entities": []
      },
      {
        "text": "Bye bye",
        "intent": "goodbye",
        "entities": []
      },
      {
        "text": "have a good one",
        "intent": "goodbye",
        "entities": []
      }
    ]
  }
}

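Using the configuration and JSON data above, I trained Rasa through the following HTTP endpoint:

/train?project=Project

After training, a Project folder containing the trained model data was created.

I started the Rasa server with the following command: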

python -m rasa_nlu.server -c config_spacy.json

The server is now running on port 5000.

When I call the parse endpoint '/parse?q=hello&project=Project', I get the response below.

{
  "intent": {
    "name": "greet",
    "confidence": 0.6409561289105246
  },
  "entities": [],
  "intent_ranking": [
    {
      "name": "greet",
      "confidence": 0.6409561289105246
    },
    {
      "name": "goodbye",
      "confidence": 0.16788352870824252
    },
    {
      "name": "restaurant_search",
      "confidence": 0.10908268742176423
    },
    {
      "name": "affirm",
      "confidence": 0.08207765495946878
    }
  ],
  "text": "hello"
}

When I call the parse endpoint '/parse?q=Great choice&project=Project', I get the response below.

{
  "intent": {
    "name": "affirm",
    "confidence": 0.7718580601897227
  },
  "entities": [],
  "intent_ranking": [
    {
      "name": "affirm",
      "confidence": 0.7718580601897227
    },
    {
      "name": "goodbye",
      "confidence": 0.11611828257295627
    },
    {
      "name": "greet",
      "confidence": 0.07060567364272623
    },
    {
      "name": "restaurant_search",
      "confidence": 0.04141798359459499
    }
  ],
  "text": "Great choice"
}

When I call the parse endpoint '/parse?q=Great choice&Project=Project' (note the capital 'P' in the parameter name), I get the response below.

{
  "intent": {
    "name": "None",
    "confidence": 1
  },
  "entities": [],
  "text": "Great choice"
}

When I call the parse endpoint '/parse?q=Book a cab&project=Project', I get the response below.

{
  "intent": {
    "name": "goodbye",
    "confidence": 0.40930529216955336
  },
  "entities": [],
  "intent_ranking": [
    {
      "name": "goodbye",
      "confidence": 0.40930529216955336
    },
    {
      "name": "restaurant_search",
      "confidence": 0.31818118919270977
    },
    {
      "name": "greet",
      "confidence": 0.20524111006007764
    },
    {
      "name": "affirm",
      "confidence": 0.06727240857765926
    }
  ],
  "text": "Book a cab"
}

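For every request it responds with something, and sometimes it does not respond with an intent at all. If you compare the responses in Parse_reponce2.txt and Parse_reponce3.txt, the only thing I changed was the lowercase 'p' to an uppercase 'P', and because of that change I got a different result.

The training data contains no "Book a cab" text or any related intent. But when I send a parse request with this text, I do not get a None intent; it returns an intent result. No random parse request ever gets a None intent.

Is this a problem with my training, or is something else wrong? Please let me know how to get correct intent results as well as entity results.

Content of configuration file (if used and relevant):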

{

    "project":"Project",
    "pipeline" : ["nlp_spacy", "tokenizer_spacy", "intent_featurizer_spacy","ner_crf", "ner_synonyms", "intent_classifier_sklearn","ner_spacy"],
    "path" : "./projects",
    "cors_origins":["*"],
    "data" : "./data/examples/rasa/People.json"

}

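2 Answers:

Answer 0: (score: 0)

URL parameters are case-sensitive, which is why the two "Great choice" examples produce different output. In the second case, Rasa did not find a project/model to parse with.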
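
One way to avoid this kind of mistake is to build the query string in code instead of typing it by hand. The following is a minimal sketch, not part of Rasa NLU itself: the host name `localhost` and the helper `build_parse_url` are assumptions made for illustration, while the port, the `/parse` path, and the lowercase `q` and `project` keys come from the question.

```python
from urllib.parse import urlencode

# Assumption: the server runs locally; the question only states port 5000.
BASE_URL = "http://localhost:5000"


def build_parse_url(text, project):
    """Build a /parse URL using the lowercase keys that worked in the question.

    build_parse_url is a hypothetical helper, not a Rasa NLU function.
    """
    # urlencode escapes spaces and other special characters in the values.
    query = urlencode({"q": text, "project": project})
    return f"{BASE_URL}/parse?{query}"


if __name__ == "__main__":
    print(build_parse_url("Great choice", "Project"))
```

Keeping the parameter names in a single place means a stray capital letter cannot slip into one request and silently change the result.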

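Rasa NLU will always return a matching intent. So in the last example you can see that it returned an intent, but with low confidence. Handling this is known as fallback or out of scope. The two main approaches discussed for handling fallback are to implement logic that takes over when the confidence falls below some threshold, or to train an actual fallback intent with examples of all the non-intents you want to catch.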
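
The threshold approach can be sketched on the client side roughly as follows. This is only an illustration under stated assumptions: the cut-off value `0.5`, the label `"fallback"`, and the function `resolve_intent` are hypothetical and are not part of Rasa NLU; the input dictionaries mirror the response shape shown in the question.

```python
# Hypothetical cut-off; a suitable value has to be tuned on your own data.
FALLBACK_THRESHOLD = 0.5


def resolve_intent(parse_result, threshold=FALLBACK_THRESHOLD):
    """Return the predicted intent name, or "fallback" if it is not trustworthy.

    parse_result is a dict shaped like the /parse responses in the question.
    """
    intent = parse_result.get("intent") or {}
    name = intent.get("name")
    confidence = intent.get("confidence", 0.0)
    # Treat a missing intent, the literal "None" intent, or a low score as fallback.
    if name is None or name == "None" or confidence < threshold:
        return "fallback"
    return name


if __name__ == "__main__":
    # Values copied from the responses in the question.
    book_a_cab = {"intent": {"name": "goodbye", "confidence": 0.40930529216955336}}
    great_choice = {"intent": {"name": "affirm", "confidence": 0.7718580601897227}}
    print(resolve_intent(book_a_cab))
    print(resolve_intent(great_choice))
```

With this cut-off, the "Book a cab" response from the question is routed to the fallback branch, while "Great choice" keeps its `affirm` intent. With so few training examples the confidence scores may not separate cleanly, so the threshold is a judgment call.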

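Answer 1: (score: 0)

I ran into the same problem with Rasa NLU, where I had about 120 examples across 5 different intents, plus 5-7 entities. Here, you appear to be using the spacy-sklearn pipeline. spaCy generally needs more data to train and detect intents and (even more so) entities. The documentation says that 500-1000 examples would be considered "decent and good" for the library.

In my case, I changed the pipeline to MITIE-sklearn and got a decent model trained on only 80 examples with the same number of intents as before. Also, spaCy tends to be faster, as you pointed out, but MITIE took about 6 minutes for the 80-example model.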