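I'm using CLiPS pattern.search (Python 2.7) for pattern matching in text. I need to extract the phrases that correspond to 'VBN NP' and 'NP TO NP'. I can do this with two separate searches and then join the results: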
from pattern.en import parse,parsetree
from pattern.search import search
text="Published case-control studies have a lot of information about susceptibility to asthma."
sentenceTree = parsetree(text, relations=True, lemmata=True)
matches = []
for match in search("VBN NP", sentenceTree):
    matches.append(match.string)
for match in search("NP TO NP", sentenceTree):
    matches.append(match.string)
print matches
# Output: [u'Published case-control studies', u'susceptibility to asthma']
But I'd like to combine them into a single search pattern. If I try the following, I get no results at all.
matches = []
for match in search("VBN NP|NP TO NP", sentenceTree):
    matches.append(match.string)
print matches
#Output: []
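The official documentation doesn't offer any clues. I also tried '{VBN NP} | {NP TO NP}' and '[VBN NP] | [NP TO NP]', but without any luck.
The question is: is it possible to combine search patterns in CLiPS pattern.search? If the answer is yes, how do I do it?
Answer 0 (score: 0)
This pattern worked for me: {VBN NP} *+ {NP TO NP}, together with the match() and group() methods.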
>>> from pattern.search import match
>>> from pattern.en import parsetree
>>> t = parsetree('Published case-control studies have a lot of information about susceptibility to asthma.',relations= True)
>>> m = match('{VBN NP} *+ {NP TO NP}',t)
>>> m.group(0) #matches the complete pattern
[Word(u'Published/VBN'), Word(u'case-control/NN'), Word(u'studies/NNS'), Word(u'have/VBP'), Word(u'a/DT'), Word(u'lot/NN'), Word(u'of/IN'), Word(u'information/NN'), Word(u'about/IN'), Word(u'susceptibility/NN'), Word(u'to/TO'), Word(u'asthma/NN')]
>>> m.group(1) # matches the first group
[Word(u'Published/VBN'), Word(u'case-control/NN')]
>>> m.group(2) # matches the second group
[Word(u'susceptibility/NN'), Word(u'to/TO'), Word(u'asthma/NN')]
Finally, you can collect the results like this:
>>> matches=[]
>>> for i in range(2):
...     matches.append(m.group(i+1).string)
...
>>> matches
[u'Published case-control', u'susceptibility to asthma']
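One caveat, as far as I can tell: the pattern above is a sequence rather than an alternation, so it should only match when a 'VBN NP' phrase is followed later in the same sentence by an 'NP TO NP' phrase. If either phrase may appear on its own, running the patterns separately and merging the results may be the simpler route. Below is a minimal sketch of that idea. The helper name search_any and its search_fn parameter are hypothetical (they are not part of pattern); search_fn is assumed to behave like pattern.search.search, taking a pattern and a tree and returning match objects with a .string attribute.

```python
def search_any(patterns, tree, search_fn):
    """Run each pattern separately and collect the matched strings.

    Hypothetical helper, not part of the pattern library.
    search_fn is assumed to behave like pattern.search.search:
    called as search_fn(pattern, tree), it returns match objects
    that expose a .string attribute.
    """
    results = []
    for p in patterns:
        for m in search_fn(p, tree):
            # Keep first-seen order and skip duplicate phrases.
            if m.string not in results:
                results.append(m.string)
    return results
```

With the question's setup, this would be called as search_any(["VBN NP", "NP TO NP"], sentenceTree, search), where search is imported from pattern.search.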