Question

This is really two separate questions. One of them is less important, so I'll add it at the end.

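I'm creating a LIME explainer instance, and for some reason the top predicted label it shows differs from the original model's output, even though the actual probabilities are identical.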

The model is a Yelp review star-rating predictor. Here you can see the LIME explainer's output of the labels and their probabilities.

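And this is the original model's label output with the respective probabilities. You can see that the probabilities here match those in the previous image exactly.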

Is this typical for LIME, or is something clearly wrong with my implementation? I'm posting the code here.

import re
import fasttext
import lime.lime_text
import numpy as np
 
def strip_formatting(string):
    string = string.lower()
    string = re.sub(r"([.!?,'/()])", r" \1 ", string)
    return string
 
def tokenize_string(string):
    return string.split()
 
classifier = fasttext.load_model('fasttext_nlp_model.bin')
 
explainer = lime.lime_text.LimeTextExplainer(
    split_expression=tokenize_string,
    bow=False,
    class_names=["No Stars", "1 Star", "2 Stars", "3 Stars", "4 Stars", "5 Stars"]
)
# Reorder each text's probabilities so the columns follow the label names in
# ascending order, then return all rows as one 2-D array for LIME.
def fasttext_prediction_in_sklearn_format(classifier, texts):
    res = []
    # Request up to 10 labels (with probabilities) per text
    labels, probabilities = classifier.predict(texts, 10)
    for label, probs in zip(labels, probabilities):
        order = np.argsort(np.array(label))
        res.append(probs[order])

    return np.array(res)
 
# Review to explain
# review = "So I was there last night and after I finished my food I go up to pay and tell the lady I'm paying for table #99, she tells me it's $99. Knowing it's a joke I smile and hand her my cash, but her expression doesn't change and she repeats the amount, at this point I'm starting to panic thinking, could I be mixed up with someone else's bill? As I'm getting nervous the lady finally smiles and admits to joking and runs me up cackling the whole time and I'm laughing and feeling foolish. I've always loved this place for its good food, great prices, and generous portions, and it's open late so I can even manage to eat leisurely after work, I recommend this place to all my friends."
review = "The food was great, kidding not really."
preprocessed_review = strip_formatting(review)
 
exp = explainer.explain_instance(
    preprocessed_review,
    classifier_fn=lambda x: fasttext_prediction_in_sklearn_format(classifier, x),
    # number of labels to explain
    top_labels=5,
    # number of words to look at
    num_features=7,
)
exp.show_in_notebook(text=True)
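
One thing that may be worth checking, offered as a sketch rather than a diagnosis: the wrapper above sorts each row by label string, so column i belongs to the i-th label in sorted order, and LIME pairs column i with class_names[i]. If the model only has five labels (assumed here to be `__label__1` through `__label__5`, which the question does not confirm), a six-entry class_names list starting with "No Stars" would shift every displayed name by one while leaving the probabilities untouched. The labels and probabilities below are made up for illustration.

```python
import numpy as np

# Hypothetical fastText-style output for one text: labels arrive sorted by
# descending probability, not by star rating. These values are invented.
labels = ("__label__4", "__label__5", "__label__3", "__label__2", "__label__1")
probs = np.array([0.50, 0.30, 0.12, 0.05, 0.03])

# Same reordering as in the wrapper: sort the columns by label string.
order = np.argsort(np.array(labels))
row = probs[order]  # column i now belongs to the i-th label in sorted order

# class_names copied from the question (six entries for five assumed labels)
class_names = ["No Stars", "1 Star", "2 Stars", "3 Stars", "4 Stars", "5 Stars"]

top_column = int(np.argmax(row))
name_shown = class_names[top_column]          # name found by position
label_predicted = sorted(labels)[top_column]  # label the column really holds

print(name_shown, label_predicted)  # 3 Stars __label__4
```

Under these assumptions the probability 0.50 is reported correctly but appears under "3 Stars" instead of four stars. Comparing `sorted(classifier.get_labels())` with class_names entry by entry would show whether the two lists actually line up for the real model.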

My second question is simple. In this line: labels, probabilities = classifier.predict(texts, 10), when I use a number smaller than 4, the exp.show_in_notebook command just gives a blank output. Why is that?
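
A possible factor, sketched under assumptions rather than confirmed: with a small k, fastText returns only the top k labels for each text, and those k labels can differ from one perturbed text to the next. Sorting each row by label name then produces columns that mean different classes in different rows, and fewer columns than the top_labels=5 that the explainer is asked for. The label names, the probabilities, and the helpers `to_row` and `to_full_row` below are hypothetical.

```python
import numpy as np

def to_row(labels, probs):
    """Reorder one prediction by label string, as the wrapper in the question does."""
    order = np.argsort(np.array(labels))
    return np.asarray(probs)[order]

# Hypothetical top-3 predictions for two perturbed texts (invented values)
pred_a = (("__label__5", "__label__4", "__label__3"), (0.6, 0.3, 0.1))
pred_b = (("__label__1", "__label__2", "__label__5"), (0.7, 0.2, 0.1))

matrix = np.array([to_row(*pred_a), to_row(*pred_b)])
# Column 0 is "3 stars" in the first row but "1 star" in the second row.
print(matrix.shape)  # (2, 3)

# One way to keep columns aligned: place every probability into a fixed slot.
# The full label list is an assumption; the real one could be read from the
# model, for example with classifier.get_labels().
all_labels = ["__label__1", "__label__2", "__label__3", "__label__4", "__label__5"]
slot = {label: i for i, label in enumerate(all_labels)}

def to_full_row(labels, probs):
    """Scatter the returned probabilities into a fixed-width row; missing labels stay 0."""
    row = np.zeros(len(all_labels))
    for label, p in zip(labels, probs):
        row[slot[label]] = p
    return row

full = np.array([to_full_row(*pred_a), to_full_row(*pred_b)])
print(full.shape)  # (2, 5)
```

With fixed slots every row has one column per class regardless of k, although labels outside the top k are reported as 0 instead of their true probability, so requesting all labels is probably the simpler choice.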

LIME explainer gives a different predicted output than the actual model's prediction

0 answers: