Question

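I am working through the tutorial exercises on the spaCy website. I have completed the Matcher exercise, and the tutorial site returns the expected output. When I paste the code into a Jupyter notebook on my work laptop, I also get the expected output, but when I run the same code on my home PC, I get something different.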

import spacy

# Import the Matcher
from spacy.matcher import Matcher

nlp = spacy.load("en_core_web_sm")
doc = nlp("New iPhone X release date leaked as Apple reveals pre-orders by mistake")

# Initialize the Matcher with the shared vocabulary
matcher = Matcher(nlp.vocab)

# Create a pattern matching two tokens: "iPhone" and "X"
pattern = [{'TEXT': 'iPhone'}, {'TEXT': 'X'}]

# Add the pattern to the matcher
matcher.add("IPHONE_X_PATTERN", None, pattern)

# Use the matcher on the doc
matches = matcher(doc)
print("Matches:", [doc[start:end].text for match_id, start, end in matches])

The expected result is:

Matches: ['iPhone X']

But the output on my home computer is:

Matches: ['New', 'iPhone', 'X', 'release', 'date', 'leaked', 'as', 'Apple', 'reveals', 'pre', '-', 'orders', 'by', 'mistake']

This is confirmed by len(matches) returning 14.

I assume something is different about my setup at home, but can anyone confirm?

Answer 1

Maybe it is a different version of the en_core_web_sm model?

Could you run python -m spacy validate and compare the output on the two systems?
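If it helps to compare the two machines side by side, here is a minimal sketch that prints the installed versions using only the standard library. It assumes the model was installed as a pip package named en_core_web_sm (the usual result of python -m spacy download); the helper names installed_version and environment_report are made up for this example and are not part of spaCy.

```python
import sys
from importlib import metadata


def installed_version(package_name):
    """Return the installed version of a package, or None if it is not installed."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


def environment_report():
    """Collect the versions most likely to differ between the two machines."""
    return {
        "python": sys.version.split()[0],
        "spacy": installed_version("spacy"),
        # Assumption: the model was installed as a pip package under this name.
        "en_core_web_sm": installed_version("en_core_web_sm"),
    }


if __name__ == "__main__":
    for name, version in environment_report().items():
        print(f"{name}: {version}")
```

Run it in both environments (including inside the Jupyter kernel, which may use a different Python than your shell) and compare the lines. A mismatch in the spacy or en_core_web_sm version would support the guess above, though it would not by itself prove that this is the cause.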
