Bigrams in Python

Time: 2017-08-24 13:16:11

Tags: python list tuples list-comprehension nested-lists

I have a list of lists. I was able to generate bigrams within each inner list, as shown below:

  

[[('bacteria','agricultur'),('agricultur','soil'),('soil','presenc'),('presenc','sampl')],[('bacteria','agricultur'),('agricultur','soil'),('soil','presenc'),('presenc','sampl')],[('nodul','uragensi')],[('nodul','stem'),('stem','nodul')],[('deform','morphoid')]]

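Now I need to replace the comma inside each bigram tuple with an underscore, which I have not been able to do. The result should look like this: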

  

[[(bacteria_agricultur),(agricultur_soil),(soil_presenc),(presenc_sampl)],[(bacteria_agricultur),(agricultur_soil),(soil_presenc),(presenc_sampl)],[(nodul_uragensi)],[(nodul_stem),(stem_nodul)],[(deform_morphoid)]]

When I use join, it gives me an error:

texts = ["_".join(word) for word in texts]

Error:

TypeError: sequence item 0: expected str instance, tuple found

How can I generate the output above? Thanks.

2 answers:

Answer 0 (score: 1)

You can use a nested list comprehension:

In [446]: [['_'.join(y) for y in x] for x in lst]
Out[446]: 
[['bacteria_agricultur', 'agricultur_soil', 'soil_presenc', 'presenc_sampl'],
 ['bacteria_agricultur', 'agricultur_soil', 'soil_presenc', 'presenc_sampl'],
 ['nodul_uragensi'],
 ['nodul_stem', 'stem_nodul'],
 ['deform_morphoid']]
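
For reference, here is a self-contained sketch of the same comprehension. It assumes the question's data is bound to the name lst (the post does not show that assignment) and uses a shortened copy of the data; the names inner, pair, joined and flat_join_error are only illustrative. It also shows why the single-level join in the question fails: each element of the outer list is itself a list of tuples, so str.join is handed tuples rather than strings.

```python
# Assumed input: a shortened copy of the question's nested list of bigram
# tuples, bound to the name `lst` used in the answer.
lst = [
    [('bacteria', 'agricultur'), ('agricultur', 'soil'),
     ('soil', 'presenc'), ('presenc', 'sampl')],
    [('nodul', 'uragensi')],
    [('nodul', 'stem'), ('stem', 'nodul')],
    [('deform', 'morphoid')],
]

# One level of iteration: str.join receives a list of tuples and raises TypeError.
try:
    ["_".join(inner) for inner in lst]
    flat_join_error = None
except TypeError as exc:
    flat_join_error = str(exc)

# Two levels of iteration: join each tuple, keep the outer nesting intact.
joined = [['_'.join(pair) for pair in inner] for inner in lst]
print(joined)
```

The inner comprehension works because every bigram tuple contains only strings, which is what str.join requires.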

If you insist on keeping the parentheses, you can also create single-element tuples:

In [447]: [[('_'.join(y), ) for y in x] for x in lst]
Out[447]: 
[[('bacteria_agricultur',),
  ('agricultur_soil',),
  ('soil_presenc',),
  ('presenc_sampl',)],
 [('bacteria_agricultur',),
  ('agricultur_soil',),
  ('soil_presenc',),
  ('presenc_sampl',)],
 [('nodul_uragensi',)],
 [('nodul_stem',), ('stem_nodul',)],
 [('deform_morphoid',)]]

Answer 1 (score: 0)

NewData = []
for bigrams in lists:
    for grams in bigrams:
        # str(grams) is e.g. "('nodul', 'stem')"; drop the quotes and turn ", " into "_"
        NewData.append(str(grams).replace("'", "").replace(", ", "_"))
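
Note that this loop appends every bigram to one flat list, and each entry keeps the parentheses from the tuple's string form, for example '(nodul_stem)'. If the nested shape from the question is wanted, a variant along the following lines might do. The sample data and the names new_data, row and gram are assumptions for illustration. The approach depends on the string form of a tuple, so it would break if a word itself contained a quote character or the sequence ", ".

```python
# Assumed sample input: a small subset of the question's data.
lists = [
    [('nodul', 'stem'), ('stem', 'nodul')],
    [('deform', 'morphoid')],
]

new_data = []
for bigrams in lists:
    # Build one inner list per original inner list so the nesting is preserved.
    row = []
    for gram in bigrams:
        # str(gram) looks like "('nodul', 'stem')": strip quotes, swap ", " for "_".
        row.append(str(gram).replace("'", "").replace(", ", "_"))
    new_data.append(row)

print(new_data)
```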