Question

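I have a file that looks like this:

    >Organism1
    ETTGDMND
    >Organism2
    PDELMESPEER
    >Organism3
    YERLLRRAQ
    >Organism1
    EDLTEVSGIGC

I want to create a dictionary in which the uppercase strings (the amino acid sequences) are the keys and the organism names are the values. So far, I have:

    dict1 = {}
    for line in file.readlines():
        line = line.rstrip()
        if ">" not in line:        # no '>' in the line = amino acid seq
            key = line             # assign the line to a variable 'key'
            dict1[key] = []        # make this variable a key of dict1
        else:                      # '>' is in the line = organism
            value = line
            dict1[key] = value
    print dict1

It raises the error message "'key' is not defined". But I thought I had defined it by saying key = line..?

A related question using the same input file: if I only want to pull the amino acid sequences out of that file (for other purposes), I tried collecting them in a loop with my_sequences = [line], but it only printed one sequence instead of all of them. Can anyone help me? Thanks!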

Answer 1

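Since your value always appears before your key, the straightforward approach is to "remember" the value in another variable that you can use once you reach the key. So the following should work: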

    dict1 = {}
    with open("somedata.dat") as f:
        for line in f:                # note you can leave out readlines() here
            line = line.rstrip()
            if line.startswith(">"):  # safer to check just the first char
                value = line[1:]      # use [1:] to drop the ">" from the value
            elif line:                # ignore blank lines
                dict1[line] = value
    print(dict1)

If a single value is followed by several lines of amino acid keys, all of those keys will get the same value.
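
To illustrate that last point, here is a minimal sketch that wraps the same remember-the-header idea in a function and runs it on an in-memory copy of the data. The function name `parse_sequences` and the extra sequence line `AAAA` are made up for the example, and the code assumes Python 3.

```python
import io

def parse_sequences(lines):
    """Map each sequence line to the most recent '>' header seen before it."""
    result = {}
    organism = None                  # no header seen yet
    for line in lines:
        line = line.strip()
        if not line:                 # skip blank lines
            continue
        if line.startswith(">"):
            organism = line[1:]      # remember the header, minus the ">"
        elif organism is not None:   # ignore sequences that precede any header
            result[line] = organism
    return result

# Hypothetical sample: Organism2 is followed by two sequence lines.
sample = io.StringIO(">Organism1\nETTGDMND\n>Organism2\nPDELMESPEER\nAAAA\n")
dict1 = parse_sequences(sample)
print(dict1)
```

Both `PDELMESPEER` and `AAAA` end up mapped to `Organism2`, because the remembered header only changes when the next `>` line is read.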

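For your second question, the problem is that this line:

    my_sequences = [line]

always replaces my_sequences, regardless of its previous value, so you end up with a single-item list containing only the last sequence processed. Replacing it with:

    my_sequences.append(line)

adds an item to the end of the list instead, which does what you want.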
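
Put together, collecting only the sequence lines might look like the sketch below. The loop around `my_sequences` is not shown in the question, so the surrounding code (including the function name `collect_sequences`) is an assumption; it is written for Python 3.

```python
import io

def collect_sequences(lines):
    """Return every non-header, non-blank line in file order."""
    my_sequences = []                  # create the list once, before the loop
    for line in lines:
        line = line.strip()
        if line and not line.startswith(">"):
            my_sequences.append(line)  # grow the list instead of replacing it
    return my_sequences

sample = io.StringIO(">Organism1\nETTGDMND\n>Organism2\nPDELMESPEER\n")
print(collect_sequences(sample))
```

The key detail is that the empty list is created before the loop, so `append` has something to add to on the first sequence line.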
