How to tokenize noun phrases and other words?

Time: 2018-08-12 16:40:49

Tags: tokenize spacy

The following code shows how to print either all tokens or only the noun chunks.

$ ./main.py 
[Hello, ,, world, ., Here, are, two, sentences, .]
[two sentences]
$ cat main.py 
#!/usr/bin/env python
# vim: set noexpandtab tabstop=2 shiftwidth=2 softtabstop=-1 fileencoding=utf-8:

import spacy
nlp = spacy.load('en')
doc = nlp(u'Hello, world. Here are two sentences.')
# print() with a single argument works under both Python 2 and Python 3
print(list(doc))
print(list(doc.noun_chunks))

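Is there a simple way to iterate over both the words (when they are not part of a noun chunk) and the noun chunks? (For this example, I would like something like the following.)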

[Hello, ,, world, ., Here, are, two sentences, .]
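One possible approach, offered as a minimal sketch rather than an established recipe: walk the document once, emit each token that lies outside every noun chunk on its own, and emit each noun chunk as a single slice. The helper name `merge_chunks` is hypothetical and not part of spaCy. The sketch assumes the chunks do not overlap and arrive in document order, and it relies only on indexing and slicing, so it can be tried on a plain list without spaCy installed.

```python
def merge_chunks(tokens, spans):
    """Yield tokens outside every span one by one, and each span as one slice.

    tokens: any sequence that supports indexing and slicing (a list, or
            presumably a spaCy Doc).
    spans:  (start, end) index pairs, end exclusive, assumed to be
            non-overlapping and sorted by start.
    """
    pos = 0
    for start, end in spans:
        # Tokens between the previous span and this one are emitted singly.
        for i in range(pos, start):
            yield tokens[i]
        # The span itself is emitted as a single item.
        yield tokens[start:end]
        pos = end
    # Trailing tokens after the last span.
    for i in range(pos, len(tokens)):
        yield tokens[i]


# Plain-list demonstration; (6, 8) stands in for the chunk "two sentences".
words = ['Hello', ',', 'world', '.', 'Here', 'are', 'two', 'sentences', '.']
merged = list(merge_chunks(words, [(6, 8)]))
# merged == ['Hello', ',', 'world', '.', 'Here', 'are', ['two', 'sentences'], '.']

# With spaCy (assumption: doc is the Doc object from main.py above):
#   spans = [(chunk.start, chunk.end) for chunk in doc.noun_chunks]
#   print(list(merge_chunks(doc, spans)))
```

With a spaCy `Doc`, `doc[i]` is a token and `doc[start:end]` is a span, so the result mixes the two types without modifying the document. If modifying it in place is acceptable, newer spaCy releases also offer `doc.retokenize()`, whose `merge` method can fuse each noun chunk into a single token.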

0 answers:

No answers yet