How can I parse newline-separated tokens such as the following:
Wolff PERSON
is O
in O
Argentina LOCATION

The O
US LOCATION
Envoy O
noted O
into complete sentences like these, using Python?
Wolff is in Argentina
The US Envoy noted
Answer 0 (score: 1)
You can use itertools.groupby:
>>> from io import StringIO
>>> from itertools import groupby
>>> s = '''Wolff PERSON
... is O
... in O
... Argentina LOCATION
...
... The O
... US LOCATION
... Envoy O
... noted O'''
>>> c = StringIO(s)  # file-like object; iterating yields one line at a time
>>> for k, g in groupby(c, key=str.isspace):
...     if not k:  # skip the blank-line groups that separate sentences
...         print(' '.join(x.split(None, 1)[0] for x in g))
...
Wolff is in Argentina
The US Envoy noted
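To see why this works, it can help to look at what groupby actually yields. The following is only an illustrative sketch, and the `lines` list is made-up sample data rather than anything from the question. groupby bundles consecutive items that share the same key, so each run of non-blank lines becomes one group and each blank line becomes a separator group whose key is True.

```python
from itertools import groupby

# Hypothetical sample input: two short sentences separated by one blank line.
lines = ["Wolff PERSON\n", "is O\n", "\n", "The O\n", "US LOCATION\n"]

# Materialize each (key, group) pair; the key is True for whitespace-only lines.
runs = [(is_blank, list(group))
        for is_blank, group in groupby(lines, key=str.isspace)]

for is_blank, group in runs:
    print(is_blank, group)
```

This is also why the input needs a blank line between sentences: without one, every line has the same key and everything ends up in a single group.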
If the input actually comes from a string rather than a file, then:
>>> for k, g in groupby(s.splitlines(), key=lambda x: not x.strip()):
...     if not k:  # k is True for empty or whitespace-only lines
...         print(' '.join(x.split(None, 1)[0] for x in g))
...
Wolff is in Argentina
The US Envoy noted
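If you need the sentences as values rather than printed output, the same idea can be wrapped in a small function. This is a minimal sketch, not part of the original answer: the name `tokens_to_sentences` and the `sample` string are hypothetical, and it assumes sentences are separated by at least one blank (or whitespace-only) line.

```python
from itertools import groupby


def tokens_to_sentences(text):
    """Return one sentence per block of non-blank "token TAG" lines."""
    sentences = []
    # A line counts as a separator when it is empty or whitespace-only.
    for is_blank, group in groupby(text.splitlines(),
                                   key=lambda line: not line.strip()):
        if not is_blank:
            # Keep only the first field (the token) and drop the tag.
            sentences.append(" ".join(line.split(None, 1)[0] for line in group))
    return sentences


# Hypothetical sample mirroring the data in the question.
sample = ("Wolff PERSON\nis O\nin O\nArgentina LOCATION\n"
          "\n"
          "The O\nUS LOCATION\nEnvoy O\nnoted O")
print(tokens_to_sentences(sample))
```

Returning a list keeps the parsing separate from the printing, so the caller can decide whether to print the sentences, write them to a file, or process them further.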