我想从第二个令牌中删除\t
。
正在尝试循环但没有成功。有什么帮助吗?
import re
regex = re.compile(r'[\t]')
for sent in train_sents:
for tuples in sent:
print tuples[1]
[('O', 'Identification\t'),
('O', 'of\t'),
('O', 'APC2,\t'),
('O', 'a\t'),
('O', 'homologue\t'),
('O', 'of\t'),
('O', 'the\t'),
('B-DISEASE', 'adenomatous\t'),
('I-DISEASE', 'polyposis\t'),
('I-DISEASE', 'coli\t'),
('I-DISEASE', 'tumour\t'),
('O', 'suppressor\t'),
('O', '.\t')],
[('O', 'The\t'),
('B-DISEASE', 'adenomatous\t'),
('I-DISEASE', 'polyposis\t'),
('I-DISEASE', 'coli\t'),
('I-DISEASE', '(\t'),
('I-DISEASE', 'APC\t'),
('I-DISEASE', ')\t'),
('I-DISEASE', 'tumour\t'),
('O', '-suppressor\t'),
('O', 'protein\t'),
('O', 'controls\t'),
('O', 'the\t'),
('O', 'Wnt\t'),
('O', 'signalling\t'),
('O', 'pathway\t'),
('O', 'by\t'),
('O', 'forming\t'),
('O', 'a\t'),
('O', 'complex\t'),
('O', 'with\t'),
('O', 'glycogen\t'),
('O', 'synthase\t'),
('O', 'kinase\t'),
('O', '3beta\t'),
('O', '(\t'),
('O', 'GSK-3beta\t'),
('O', ')\t'),
('O', ',\t'),
('O', 'axin\t'),
('O', '/\t'),
('O', 'conductin\t'),
('O', 'and\t'),
('O', 'betacatenin\t'),
('O', '.\t')]
答案 0 :(得分:1)
replace()
在这里应该是有用的。见下文:
lst=[('O', 'signalling\t'),('O', 'kinase\t'),('try_yourself_first','happy_coding\t')]
for tup,i in zip (lst,range(0,len(lst))):
lst[i]=(tup[0],tup[1].replace('\t',''))
print(lst)
输出:
[('O', 'signalling'), ('O', 'kinase'), ('try_yourself_first', 'happy_coding')]