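I am using spaCy for POS tagging and am getting the error below. I have a DataFrame with a "Description" column, and I need to extract the POS tag of each word in it.
DataFrame: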
No. Description
1 My net is not working
2 I will be out for dinner
3 Can I order food
4 Wifi issue
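Code:
import pandas as pd
import spacy

# Assumption: the question does not show how nlp was created; any English pipeline with a tagger would do
nlp = spacy.load("en_core_web_sm")

read_data = pd.read_csv('C:\\Users\\abc\\def\\pqr\\Data\\training_data.csv', encoding="utf-8")
entity = []
for parsed_doc in read_data['Description']:
    doc = nlp(parsed_doc)
    a = [(X.text, X.tag_) for X in doc.ents]  # this line raises the AttributeError shown below
    entity.append(a)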
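The code above throws an error:
Error: AttributeError: 'spacy.tokens.span.Span' object has no attribute 'tag_'
However, the same code works fine with the label_ attribute, and tag_ also works if I use a single sentence: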
doc = nlp('can you please help me to install wifi')
for i in doc:
    print(i.text, i.tag_)
Answer 0 (score: 1)
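This is because objects such as ents or chunks are Spans, i.e. collections of tokens. You therefore need to iterate over their individual tokens to get token-level attributes such as tag or tag_: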
>>> doc = nlp(u'Mr. Best flew to New York on Saturday morning.')
>>> [(X.text, X.tag_) for X in doc.ents]
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<stdin>", line 1, in <listcomp>
AttributeError: 'spacy.tokens.span.Span' object has no attribute 'tag_'
>>> [(X.text, X.tag_) for Y in doc.ents for X in Y]
[('Best', 'NNP'), ('New', 'NNP'), ('York', 'NNP'), ('Saturday', 'NNP'), ('morning', 'NN')]
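Applied to the DataFrame from the question, the same idea might look like the sketch below. The helper names tag_tokens and tag_entity_tokens are hypothetical and not part of spaCy; the sketch only assumes that nlp is a loaded pipeline whose documents yield tokens with text and tag_, and that doc.ents yields Spans, as in the session above. If the goal is the POS tag of every word, iterating the Doc directly is probably what is wanted, since doc.ents only covers words inside named entities.

```python
from typing import Callable, Iterable, List, Tuple

TaggedTokens = List[Tuple[str, str]]


def tag_tokens(nlp: Callable, texts: Iterable[str]) -> List[TaggedTokens]:
    """Return (text, tag_) pairs for every token of every text."""
    results = []
    for text in texts:
        doc = nlp(text)
        # Iterating a Doc yields Token objects, which do have tag_
        results.append([(token.text, token.tag_) for token in doc])
    return results


def tag_entity_tokens(nlp: Callable, texts: Iterable[str]) -> List[TaggedTokens]:
    """Return (text, tag_) pairs only for tokens that sit inside named entities."""
    results = []
    for text in texts:
        doc = nlp(text)
        # doc.ents yields Span objects; iterate each Span to reach its Tokens
        results.append(
            [(token.text, token.tag_) for ent in doc.ents for token in ent]
        )
    return results


# Usage with the question's objects (assumed to exist already):
#   read_data["pos"] = tag_tokens(nlp, read_data["Description"])
```

Short rows such as "Wifi issue" may contain no named entities at all, in which case tag_entity_tokens returns an empty list for that row while tag_tokens still returns one pair per word.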