Question

Rasa version - 1.3.7

pipeline: “supervised_embeddings”

I trained the bot with intent examples that contain no punctuation.

Intent: ask_holiday_in_a_year

How many holidays do we have in a year?

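If I ask the bot the following questions:

1. How many holidays do we have in a year? - (NLU is able to identify it correctly.)
2. How many ()? holidays!, do !@# we have in %^& a year $%^. - (NLU is able to identify it correctly.)
3. How many ###################### holidays do we have in a year?. (NLU is not able to identify it correctly.)
4. How many ####### holidays %% ^&* $$% do we have in a year?. (NLU is not able to identify it correctly.)

Cases 1 and 2 work, but cases 3 and 4 do not. Is there any way (for example, a setting added to the pipeline) to handle these symbols and punctuation and get the expected result?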

Answer 1

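I'm not sure whether this is the right approach, but you could try a tool like Chatito to generate random data and include symbols in the training data. Again, I'm not sure this is correct.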
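As a rough illustration of that idea, the sketch below generates noisy copies of an example without Chatito, using only the Python standard library. The symbol set, the helper names (`add_symbol_noise`, `augment`) and the number of copies are assumptions made for this example. They do not come from Rasa or Chatito.

```python
import random

# Symbols to sprinkle into examples; an assumption, adjust to what your users type.
NOISE_SYMBOLS = "#$%^&*!@()"


def add_symbol_noise(sentence, rng, max_run=5):
    """Return a copy of `sentence` with one run of random symbols inserted between words."""
    words = sentence.split()
    run = "".join(rng.choice(NOISE_SYMBOLS) for _ in range(rng.randint(1, max_run)))
    position = rng.randint(0, len(words))
    return " ".join(words[:position] + [run] + words[position:])


def augment(examples, copies=3, seed=0):
    """Keep each original example and add `copies` noisy variants of it."""
    rng = random.Random(seed)
    augmented = []
    for example in examples:
        augmented.append(example)
        augmented.extend(add_symbol_noise(example, rng) for _ in range(copies))
    return augmented


if __name__ == "__main__":
    for line in augment(["How many holidays do we have in a year?"]):
        # Rasa 1.x Markdown training data lists examples as "- text" under "## intent:<name>".
        print("- " + line)
```

The printed lines could be pasted under the `## intent:ask_holiday_in_a_year` heading of the training file. Whether this actually improves recognition for cases 3 and 4 would need to be checked by retraining.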

Answer 2

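First, Rasa recognizes messages based on the examples you provide. If you give it 3 or 4 example sentences, Rasa will recognize those. If you expect it to work out of the box, many variations of the same question can come up, and Rasa has no way of knowing that they belong to this intent. So you need to provide examples that resemble the questions the bot is likely to be asked.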

Answer 3

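This can be handled with a custom NLU component.

In the third argument of the maketrans function, add the symbols you want to remove. This custom component deletes all of the listed characters and passes the filtered text on to the rest of the NLU pipeline.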

import typing
from typing import Any, Dict, Optional, Text

from rasa.nlu.components import Component

if typing.TYPE_CHECKING:
    from rasa.nlu.model import Metadata


# The class must sit at module level. If it is nested under the
# TYPE_CHECKING block, it is never defined at runtime.
class DeleteSymbols(Component):
    """Remove unwanted symbols from the message text before tokenization."""

    provides = ["text"]
    # requires = []
    defaults = {}
    language_list = None

    def __init__(self, component_config=None):
        super(DeleteSymbols, self).__init__(component_config)

    def train(self, training_data, cfg, **kwargs):
        # Nothing to learn: the component only filters incoming text.
        pass

    def process(self, message, **kwargs):
        # The third argument of str.maketrans lists the characters to delete.
        mt = message.text
        message.text = mt.translate(str.maketrans('', '', '$%&(){}^'))

    def persist(self, file_name: Text, model_dir: Text) -> Optional[Dict[Text, Any]]:
        # No state to save.
        pass

    @classmethod
    def load(
        cls,
        meta: Dict[Text, Any],
        model_dir: Optional[Text] = None,
        model_metadata: Optional["Metadata"] = None,
        cached_component: Optional["Component"] = None,
        **kwargs: Any
    ) -> "Component":
        """Load this component from file."""

        if cached_component:
            return cached_component
        else:
            return cls(meta)
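The character deletion itself can be tried outside Rasa. The sketch below applies the same `str.maketrans` / `str.translate` call to the fourth sentence from the question. Note that the answer's symbol list does not contain `#` or `*`, so those survive unless they are added. `EXTENDED_SYMBOLS` and the whitespace collapsing are assumptions added for this example and are not part of the component above.

```python
# Characters removed by the answer's component.
ANSWER_SYMBOLS = "$%&(){}^"
# Hypothetical extended set that also covers '#', '*', '@' and '!' from the question.
EXTENDED_SYMBOLS = ANSWER_SYMBOLS + "#*@!"


def delete_symbols(text, symbols=ANSWER_SYMBOLS):
    """Delete every character in `symbols` from `text`, then collapse repeated spaces."""
    table = str.maketrans("", "", symbols)
    return " ".join(text.translate(table).split())


if __name__ == "__main__":
    sample = "How many ####### holidays %%^&*$$% do we have in a year?"
    print(delete_symbols(sample))
    # -> How many ####### holidays * do we have in a year?
    print(delete_symbols(sample, EXTENDED_SYMBOLS))
    # -> How many holidays do we have in a year?
```

With the extended set the noisy sentence reduces to the plain training example, which is the text the intent classifier was trained on.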

Add the component to the pipeline in config.yml:

# Configuration for Rasa NLU.
# https://rasa.com/docs/rasa/nlu/components/
language: en
pipeline:
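# Custom component, referenced by its module path; adjust it to wherever the class above (DeleteSymbols) is defined.
- name: "Pipelines.TextParsing.TextParsingPipeline"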
- name: "WhitespaceTokenizer"
- name: "RegexFeaturizer"
- name: "CRFEntityExtractor"
- name: "EntitySynonymMapper"
- name: "CountVectorsFeaturizer"
- name: "EmbeddingIntentClassifier"
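The first pipeline entry is not a built-in component name. Rasa treats such a name as a Python module path ending in a class name, so the module has to be importable from where training runs. The sketch below shows the general idea with `importlib`. It is not Rasa's own loading code, and `class_from_dotted_path` is a hypothetical helper written for this example.

```python
import importlib


def class_from_dotted_path(path):
    """Split 'package.module.ClassName' into module and class, then import the class."""
    module_name, _, class_name = path.rpartition(".")
    if not module_name:
        raise ValueError("expected 'module.ClassName', got %r" % path)
    module = importlib.import_module(module_name)
    return getattr(module, class_name)


if __name__ == "__main__":
    # Standard-library stand-in. With the config above the path would be
    # "Pipelines.TextParsing.TextParsingPipeline", which must be importable.
    print(class_from_dotted_path("collections.OrderedDict"))
```

If the import fails for the configured path, the class name or the package layout (for example a missing `__init__.py`) is the first thing to check.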

Source: https://forum.rasa.com/t/how-to-handle-punctuation-and-symbol-in-rasa/19454

How to handle punctuation in Rasa?
