Argument 'string' has incorrect type (expected str, got spacy.tokens.doc.Doc)

Time: 2018-12-03 06:25:41

Tags: python nlp spacy

I have a dataframe:

train_review = train['review']
train_review

It looks like this:

0      With all this stuff going down at the moment w...
1      \The Classic War of the Worlds\" by Timothy Hi...
2      The film starts with a manager (Nicholas Bell)...
3      It must be assumed that those who praised this...
4      Superbly trashy and wondrously unpretentious 8...

I append the tokens to a string:

train_review = train['review']
train_token = ''
for i in train['review']:
   train_token +=i

I want to tokenize the reviews using spaCy. This is what I tried, but I get the following error:

  

Argument 'string' has incorrect type (expected str, got spacy.tokens.doc.Doc)

How can I fix this? Thanks in advance!

1 Answer:

Answer 0: (score: 2)

In your for loop, you are taking spacy.tokens.doc.Doc objects from the dataframe and appending them to a string, so you should cast each one to str. Like this:

train_review = train['review']
train_token = ''
for i in train['review']:
   train_token += str(i)
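
For illustration, here is a minimal, self-contained sketch of the same cast that runs without spaCy installed. FakeDoc is a hypothetical stand-in for spacy.tokens.doc.Doc, and join_reviews is a hypothetical helper name; neither is part of spaCy or of the original question. The only assumption about the real Doc is that str(doc) returns its text, as doc.text does.

```python
class FakeDoc:
    """Hypothetical stand-in for spacy.tokens.doc.Doc (not the real class)."""

    def __init__(self, text):
        self.text = text

    def __str__(self):
        # Assumption: like a real Doc, str() gives back the underlying text.
        return self.text


def join_reviews(reviews, sep=" "):
    """Concatenate reviews into one str, casting every item with str()."""
    # str(item) is a no-op for plain strings and converts Doc-like objects.
    return sep.join(str(item) for item in reviews)


# A column that mixes plain strings with already-processed Doc-like objects.
reviews = ["Superbly trashy", FakeDoc("and wondrously unpretentious")]
train_token = join_reviews(reviews)

# The result is a plain str, so it can be passed on to a tokenizer.
assert isinstance(train_token, str)
```

Using str.join avoids rebuilding the string on every iteration, and the separator keeps the last word of one review from fusing with the first word of the next. Passing sep="" reproduces the behavior of the original += loop.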