Question

I want to extract some features from POS-tagged text. My goal is to retrieve noun-verb combinations in a list. For POS tagging I use spaCy. My code currently looks like this:

from spacy.de import German
nlp = German() 
Verb = ["VERB"]  
NN = ["NOUN"]

sentence = [["Du musst folgendes tun: Scheibe schließen, Tuer oeffnen, Fenster anheben"], ["Das ist deine Loesung: Sitz zurückstellen"]]

texts = somePreprocessing(sentence) #Tokenization, Stopword removal

list2 = []
verb_toks = []
noun_toks = []
verblist = []
nounlist = []
pairlist = []

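for text in texts:
    docs = []  # parsed docs for this text (renamed from `list`, which shadows the built-in)
    for s in text:
        st = nlp(unicode(s))
        docs.append(st)
        for word in st:
            if word.pos_ in Verb:
                verblist.append(word)
            if word.pos_ in NN:
                nounlist.append(word)
        # keep the verbs and nouns of this string only if both kinds were found
        if len(verblist) != 0 and len(nounlist) != 0:
            pairlist.append((verblist, nounlist))
        verblist = []
        nounlist = []

    list2.append(docs)
print verblist  # always empty here: both lists are reset inside the loop
print nounlist
print pairlist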

The output should look like this: [["Scheibe", "schließen", "Tuer", "oeffnen", "Fenster", "anheben"], ["Sitz", "zurückstellen"]]

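To summarize:
Given a list of sentences such as [["This is an example sentence"], ["This is another example sentence"]], my aim is to retrieve, based on the POS tags, lists like [["Noun", "Verb", "Noun", "Verb", "Noun", "Verb"], ["Noun", "Verb"]].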

listOfSentences = [[".."], [".."]]
pos = posTagger(listOfSentences)
pairs = matchingNounVerb(pos)
print pairs
=> [["Noun", "Verb", "...", "...", "...", "..."], ["Noun", "Verb"]]
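
One possible reading of the matchingNounVerb step above, as a minimal sketch rather than a definitive answer: it assumes each sentence has already been tagged and reduced to (token, POS tag) tuples, and it only pairs a noun with a verb that directly follows it. The name matching_noun_verb, the tuple layout, and the hand-written tags in the sample are assumptions made for illustration; they are not output from a real tagger.

```python
from typing import List, Tuple

# (token text, coarse POS tag such as "NOUN" or "VERB"); this layout is an assumption
Tagged = Tuple[str, str]


def matching_noun_verb(tagged_sentences: List[List[Tagged]]) -> List[List[str]]:
    """For each sentence, collect every NOUN that is directly followed by a VERB."""
    result = []
    for tagged in tagged_sentences:
        pairs = []
        # walk over neighbouring tokens: (t0, t1), (t1, t2), ...
        for (word, pos), (next_word, next_pos) in zip(tagged, tagged[1:]):
            if pos == "NOUN" and next_pos == "VERB":
                pairs.extend([word, next_word])
        result.append(pairs)
    return result


# Hand-tagged sample input (tags are assumed, not produced by a tagger)
tagged_sentences = [
    [("Scheibe", "NOUN"), ("schließen", "VERB"),
     ("Tuer", "NOUN"), ("oeffnen", "VERB"),
     ("Fenster", "NOUN"), ("anheben", "VERB")],
    [("Sitz", "NOUN"), ("zurückstellen", "VERB")],
]

print(matching_noun_verb(tagged_sentences))
```

With spaCy, the tuples could presumably be built from a parsed document as [(t.text, t.pos_) for t in doc]. The adjacency rule is deliberately strict: if punctuation or other tokens remain between a noun and its verb after preprocessing, they would have to be filtered out first or the rule relaxed.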

Thanks for your help ;)

Finding noun-verb combinations in a sentence using POS tags

0 answers: