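Python rstrip and split

Time: 2017-10-20 01:51:52

Tags: python split

Hello, I want to strip the trailing newline from each element of a list and get the third word of each element. I am using split and rstrip. Here is my code: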

# obtain just compound ids from list
just_compound_id = []
for line in final_list:
    split_file = line.split(' ')
    split_file = line.rstrip()
    just_compound_id.append(split_file[2])
    print(just_compound_id)

But I get a very strange output, like this:

['I']
['I', 'I']
['I', 'I', 'I']
['I', 'I', 'I', 'I']
['I', 'I', 'I', 'I', 'I']

**Edit

Here is my input:

['UNIQUE-ID - ASN\n', 'UNIQUE-ID - D-GLT\n', 'UNIQUE-ID - 4-AMINO-BUTYRATE\n',
 'UNIQUE-ID - CPD-8569\n', 'UNIQUE-ID - CPD-17095\n', 'UNIQUE-ID - CPD-17880\n',
 'UNIQUE-ID - GLY\n', 'UNIQUE-ID - CPD-18298\n', 'UNIQUE-ID - D-SERINE\n',
 'UNIQUE-ID - ACETYLCHOLINE\n', 'UNIQUE-ID - DOPAMINE\n', 'UNIQUE-ID - SEROTONIN\n',
 'UNIQUE-ID - HISTAMINE\n', 'UNIQUE-ID - PHENYLETHYLAMINE\n', 'UNIQUE-ID - TYRAMINE\n',
 'UNIQUE-ID - CPD-58\n', 'UNIQUE-ID - 1-4-HYDROXYPHENYL-2-METHYLAMINOETHAN\n',
 'UNIQUE-ID - TRYPTAMINE\n']

4 answers:

Answer 0 (score: 1)

It should be split_file.split(' ') rather than line.split(' '), and you need to call line.rstrip() before split_file.split(' '):

just_compound_id = []
for line in final_list:
    split_file = line.rstrip()
    split_file = split_file.split(' ')
    just_compound_id.append(split_file[2])
    print(just_compound_id)

As the code stands, the first assignment to split_file has no effect, because you never use its result and overwrite it in the next assignment.
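
To make the overwrite visible, here is a minimal sketch that uses a single entry from the question's input. Strings are immutable in Python, so rstrip() and split() each return a new object rather than modifying line. The names third_char, words and third_word are illustrative only.

```python
line = 'UNIQUE-ID - ASN\n'

# Original order: the list from split() is thrown away by the second assignment.
split_file = line.split(' ')   # a list: ['UNIQUE-ID', '-', 'ASN\n']
split_file = line.rstrip()     # a str again: 'UNIQUE-ID - ASN'
third_char = split_file[2]     # indexing a str yields one character

# Corrected order: strip first, then split the stripped string.
words = line.rstrip().split(' ')
third_word = words[2]

print(third_char, third_word)
```

The first index picks a character out of a string, while the second picks a word out of a list, which is why the original loop kept appending 'I'.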

Answer 1 (score: 0)

You can use a list comprehension:

[e.rstrip().split(' ')[2] for e in final_list]
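
As a hedged sketch of the comprehension approach: the choice of separator matters, because some compound IDs contain hyphens themselves. The list below is a short excerpt from the question's input, and by_hyphen and by_whitespace are illustrative names.

```python
final_list = ['UNIQUE-ID - ASN\n', 'UNIQUE-ID - D-GLT\n', 'UNIQUE-ID - CPD-8569\n']

# Splitting on '-' also cuts inside IDs such as 'D-GLT' and keeps a leading space.
by_hyphen = [e.rstrip().split('-')[2] for e in final_list]

# split() with no argument splits on runs of whitespace, so the trailing
# newline is dropped and hyphens inside an ID are left alone.
by_whitespace = [e.split()[2] for e in final_list]

print(by_hyphen)
print(by_whitespace)
```

Splitting on whitespace works here only because none of the IDs in the question contain spaces.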

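Answer 2 (score: 0)

If the input is more complex, another option is a regular expression. For the case above, I suggest splitting each line on - and setting the maxsplit parameter, so that hyphens inside the compound ID are not split: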

final_list = ['UNIQUE-ID - ASN\n', 'UNIQUE-ID - D-GLT\n', 'UNIQUE-ID - 4-AMINO-BUTYRATE\n', 'UNIQUE-ID - CPD-8569\n', 'UNIQUE-ID - CPD-17095\n', 'UNIQUE-ID - CPD-17880\n', 'UNIQUE-ID - GLY\n', 'UNIQUE-ID - CPD-18298\n', 'UNIQUE-ID - D-SERINE\n', 'UNIQUE-ID - ACETYLCHOLINE\n',
              'UNIQUE-ID - DOPAMINE\n', 'UNIQUE-ID - SEROTONIN\n', 'UNIQUE-ID - HISTAMINE\n', 'UNIQUE-ID - PHENYLETHYLAMINE\n', 'UNIQUE-ID - TYRAMINE\n', 'UNIQUE-ID - CPD-58\n', 'UNIQUE-ID - 1-4-HYDROXYPHENYL-2-METHYLAMINOETHAN\n', 'UNIQUE-ID - TRYPTAMINE\n']
just_compound_id = []
for line in final_list:
    line = line.rstrip()
    split_id = line.split('-', 2)[2].strip()
    just_compound_id.append(split_id)
print(just_compound_id)
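
The regular-expression option mentioned in this answer could look roughly like the sketch below. The pattern is an assumption about the line shape (a key, a hyphen surrounded by whitespace, then an ID with no spaces) and is not taken from the original post. The list is an excerpt of the question's input.

```python
import re

final_list = ['UNIQUE-ID - ASN\n', 'UNIQUE-ID - 4-AMINO-BUTYRATE\n', 'UNIQUE-ID - CPD-17095\n']

# Assumed shape: <key> <whitespace> - <whitespace> <compound id> <optional trailing whitespace>
pattern = re.compile(r'^\S+\s+-\s+(\S+)\s*$')

just_compound_id = []
for line in final_list:
    match = pattern.match(line)
    if match:  # lines that do not fit the assumed shape are skipped
        just_compound_id.append(match.group(1))

print(just_compound_id)
```

For input as regular as the question's, the maxsplit version is simpler. A pattern mainly pays off when malformed lines need to be detected or skipped.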

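Answer 3 (score: -1)

Your output is wrong because the third character of every input string is I.

You are trying to get the element in the middle, right?

So use this code instead: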

# obtain just compound ids from list
just_compound_id = []
for line in final_list:
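    split_file = line.rstrip()
    split_file = split_file.split(' ')  # split the stripped string, not the original line
    just_compound_id.append(split_file[len(split_file) // 2])  # EDIT HERE: middle element; // keeps the index an int in Python 3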

print(just_compound_id)