Question

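I forked the rasa_nlu repository to work on a part of the code I want to modify. Inside the function train(...) in the file model.py there is a call to component.train(...) that seems to trigger a warning without indicating its source, and I would like to find out what triggers it.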

Basically, it applies this function to a list of components:

[<rasa_nlu.utils.spacy_utils.SpacyNLP object at 0x7f3abbfbd780>, <rasa_nlu.tokenizers.spacy_tokenizer.SpacyTokenizer object at 0x7f3abbfbd710>, <rasa_nlu.featurizers.spacy_featurizer.SpacyFeaturizer object at 0x7f3abbfbd748>, <rasa_nlu.featurizers.regex_featurizer.RegexFeaturizer object at 0x7f3abbd1a630>, <rasa_nlu.extractors.crf_entity_extractor.CRFEntityExtractor object at 0x7f3abbd1a748>, <rasa_nlu.extractors.entity_synonyms.EntitySynonymMapper object at 0x7f3abbd1a3c8>, <rasa_nlu.classifiers.sklearn_intent_classifier.SklearnIntentClassifier object at 0x7f3abbd1a240>]

It seems that the last one triggers the warning.

I tried modifying the train() function in the repository's model.py file, but that changed nothing, so I suspect it is not the right place.

Anyway, here is the code of train(...) in the file model.py:

...

import rasa_nlu
from rasa_nlu import components, utils, config
from rasa_nlu.components import Component, ComponentBuilder
from rasa_nlu.config import RasaNLUModelConfig, override_defaults
from rasa_nlu.persistor import Persistor
from rasa_nlu.training_data import TrainingData, Message
from rasa_nlu.utils import create_dir, write_json_to_file

...

class Trainer(object):
    """Trainer will load the data and train all components.

    Requires a pipeline specification and configuration to use for
    the training."""

    # Officially supported languages (others might be used, but might fail)
    SUPPORTED_LANGUAGES = ["de", "en"]

    def __init__(self,
                 cfg,  # type: RasaNLUModelConfig
                 component_builder=None,  # type: Optional[ComponentBuilder]
                 skip_validation=False  # type: bool
                 ):
        # type: (...) -> None

        self.config = cfg
        self.skip_validation = skip_validation
        self.training_data = None  # type: Optional[TrainingData]

        if component_builder is None:
            # If no builder is passed, every interpreter creation will result in
            # a new builder. hence, no components are reused.
            component_builder = components.ComponentBuilder()

        # Before instantiating the component classes, lets check if all
        # required packages are available
        if not self.skip_validation:
            components.validate_requirements(cfg.component_names)

        # build pipeline
        self.pipeline = self._build_pipeline(cfg, component_builder)

    ...

    def train(self, data, **kwargs):
        # type: (TrainingData) -> Interpreter
        """Trains the underlying pipeline using the provided training data."""
        self.training_data = data

        context = kwargs  # type: Dict[Text, Any]

        for component in self.pipeline:
            updates = component.provide_context()
            if updates:
                context.update(updates)

        # Before the training starts: check that all arguments are provided
        if not self.skip_validation:
            components.validate_arguments(self.pipeline, context)

        # data gets modified internally during the training - hence the copy
        working_data = copy.deepcopy(data)
        for i, component in enumerate(self.pipeline):
            logger.info("Starting to train component {}"
                        "".format(component.name))
            component.prepare_partial_processing(self.pipeline[:i], context)
            print("before train")
            updates = component.train(working_data, self.config,
                                      **context)
            logger.info("Finished training component.")
            print("before updates")
            if updates:
                context.update(updates)
        return Interpreter(self.pipeline, context)

The output is:

before train
before updates
before train
before updates
before train
before updates
before train
before updates
before train
before updates
before train
before updates
before train
Fitting 2 folds for each of 6 candidates, totalling 12 fits
/home/mike/Programming/Rasa/myflaskapp/rasaenv/lib/python3.5/site-packages/sklearn/metrics/classification.py:1135: UndefinedMetricWarning: F-score is ill-defined and being set to 0.0 in labels with no predicted samples.
  'precision', 'predicted', average, warn_for)
/home/mike/Programming/Rasa/myflaskapp/rasaenv/lib/python3.5/site-packages/sklearn/metrics/classification.py:1135: UndefinedMetricWarning: F-score is ill-defined and being set to 0.0 in labels with no predicted samples.
  'precision', 'predicted', average, warn_for)
/home/mike/Programming/Rasa/myflaskapp/rasaenv/lib/python3.5/site-packages/sklearn/metrics/classification.py:1135: UndefinedMetricWarning: F-score is ill-defined and being set to 0.0 in labels with no predicted samples.
  'precision', 'predicted', average, warn_for)
/home/mike/Programming/Rasa/myflaskapp/rasaenv/lib/python3.5/site-packages/sklearn/metrics/classification.py:1135: UndefinedMetricWarning: F-score is ill-defined and being set to 0.0 in labels with no predicted samples.
  'precision', 'predicted', average, warn_for)
/home/mike/Programming/Rasa/myflaskapp/rasaenv/lib/python3.5/site-packages/sklearn/metrics/classification.py:1135: UndefinedMetricWarning: F-score is ill-defined and being set to 0.0 in labels with no predicted samples.
  'precision', 'predicted', average, warn_for)
/home/mike/Programming/Rasa/myflaskapp/rasaenv/lib/python3.5/site-packages/sklearn/metrics/classification.py:1135: UndefinedMetricWarning: F-score is ill-defined and being set to 0.0 in labels with no predicted samples.
  'precision', 'predicted', average, warn_for)
[Parallel(n_jobs=1)]: Done  12 out of  12 | elapsed:    0.1s finished
before updates
trainer.persist:

Here you can see the warning that I would like to catch and modify so that it reports its source: UndefinedMetricWarning: F-score is ill-defined and being set to 0.0 in labels with no predicted samples.

So, can you see where this warning comes from? What is calling into sklearn/metrics/classification.py?

Answer 1

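This is a documented issue in the Rasa NLU repository. I suggest you follow those issues or add your comments there to help get it resolved. One of them is labeled "help wanted", which means the maintainers are looking for community contributors to fix it.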

The tl;dr on why the warning occurs, from the first issue linked above:

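So the warning is just a warning. It indicates that there are too few training examples for one or some of the intents. Adding more examples will solve the issue (that is why adding duplicates makes the warning go away, but really you should add different examples).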

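If you want to get rid of the warning, add more training data. Use the evaluate.py script to find the intents that are lacking examples.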
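As a quick check before running a full evaluation, you could also count the examples per intent yourself. The following is only a minimal sketch: it assumes your training data is in the Rasa NLU JSON format (a "rasa_nlu_data" object containing a "common_examples" list), and the helper name intents_below, the min_examples threshold and the toy data are made up for illustration.

```python
from collections import Counter


def intents_below(training_data, min_examples=5):
    """Return {intent: count} for intents with fewer than min_examples examples.

    Hypothetical helper, not part of rasa_nlu. Expects the parsed JSON
    training file (a dict) in the "rasa_nlu_data" / "common_examples" layout.
    """
    examples = training_data["rasa_nlu_data"]["common_examples"]
    counts = Counter(example["intent"] for example in examples)
    return {intent: n for intent, n in counts.items() if n < min_examples}


# Toy data set, invented for this example.
sample = {
    "rasa_nlu_data": {
        "common_examples": [
            {"text": "hello", "intent": "greet", "entities": []},
            {"text": "hi there", "intent": "greet", "entities": []},
            {"text": "good morning", "intent": "greet", "entities": []},
            {"text": "bye", "intent": "goodbye", "entities": []},
        ]
    }
}

print(intents_below(sample, min_examples=2))  # {'goodbye': 1}
```

For a real project you would load your own file with json.load(...) and pass the resulting dict instead of sample. Which threshold makes the warning disappear depends on your data and on how the cross-validation folds are split, so treat the number as something to experiment with.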

From the warning message you can see that it is raised here, in the file sklearn/metrics/classification.py.
