Compare a list and a dictionary and write the matches to a YML file

Time: 2018-10-24 10:01:15

Tags: python-3.x list dictionary yaml

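I have lists and a dictionary, and I need to compare the dictionary's keys with the values in the lists and write the matches to a YML file. These are my dictionary keys: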

samples1.keys()
dict_keys(['C3N-02289_10_L1', 'C3N-02289_10_L2', 'C3N-02671_08_L1', 'C3N-02671_08_L2', 'C3N-02671_10_L1', 'C3N-02671_10_L2'])

I have two lists:

Left_reads = [
    'C3N-02289_10_L1_R1.gz',
    'C3N-02289_10_L2_R1.gz',
    'C3N-02671_08_L1_R1.gz',
    'C3N-02671_08_L2_R1.gz',
    'C3N-02671_10_L1_R1.gz',
    'C3N-02671_10_L2_R1.gz'
]

Right_reads = [
    'C3N-02289_10_L1_R2.gz',
    'C3N-02289_10_L2_R2.gz',
    'C3N-02671_08_L1_R2.gz',
    'C3N-02671_08_L2_R2.gz',
    'C3N-02671_10_L1_R2.gz',
    'C3N-02671_10_L2_R2.gz'
]

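Now I need to compare each key of the dictionary (samples1.keys()) with the values in the lists (Left_reads and Right_reads). If a key matches a string in a list, that string should be written to the YML file. This is what I have tried: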

for sam in samples1.keys():
    ymlFile = pat + sam + '.yml'
    ymlFH = open(ymlFile, 'w')

    ymlFH.write("reads1: [\n")
    for sam in sorted(Left_reads):
        ymlFH.write(" {class: File, path: " + path + '/' + sam + "}, \n")
    ymlFH.write("]\n")

    ymlFH.write("reads2: [\n")    
    for sam in sorted(Right_reads):
        ymlFH.write(" {class: File, path: " + path + '/' + sam + "}, \n")
    ymlFH.write("]\n")
    ymlFH.close()

This writes every value of Left_reads and Right_reads under reads1 and reads2, in every file.

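My goal is a separate output per sample that contains only the matching list values. For example, the file C3N-02289_10_L1.yml should look like the following, where all values match C3N-02289_10_L1 and C3N-02289_10_L2. I need the script to compare the prefix C3N-02289_10 between the lists and the dictionary and then write the matches to the YML file.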

reads1: [
 {class: File, path: /usr/path/C3N-02289_10_L1_R1.gz}, 
 {class: File, path: /usr/path/C3N-02289_10_L2_R1.gz},

]
reads2: [
 {class: File, path: /usr/path/C3N-02289_10_L1_R2.gz}, 
 {class: File, path: /usr/path/C3N-02289_10_L2_R2.gz},

]

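The same should apply to every other key in the dictionary and its matching strings in the lists. With the code above, the output for C3N-02289_10_L1.yml looks like this: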

reads1: [
 {class: File, path: /usr/path/C3N-02289_10_L1_R1.gz},
 {class: File, path: /usr/path/C3N-02289_10_L2_R1.gz},
 {class: File, path: /usr/path/C3N-02671_08_L1_R1.gz},
 {class: File, path: /usr/path/C3N-02671_08_L2_R1.gz},
 {class: File, path: /usr/path/C3N-02671_10_L1_R1.gz},
 {class: File, path: /usr/path/C3N-02671_10_L2_R1.gz},
]
reads2: [
 {class: File, path: /usr/path/C3N-02289_10_L1_R2.gz},
 {class: File, path: /usr/path/C3N-02289_10_L2_R2.gz},
 {class: File, path: /usr/path/C3N-02671_08_L1_R2.gz},
 {class: File, path: /usr/path/C3N-02671_08_L2_R2.gz},
 {class: File, path: /usr/path/C3N-02671_10_L1_R2.gz},
 {class: File, path: /usr/path/C3N-02671_10_L2_R2.gz},
]

1 Answer:

Answer 0 (score: 1)

First, let's start with your goal.

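Your code never compares the dictionary keys with the values in the lists. As I understand it, you want to check whether the current dictionary key is a prefix of a value in the list and, if it is, dump that file name into the .yaml file.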

So your code should look something like this:

for prefix in samples1.keys():
    for filename in some_list:
        if filename.startswith(prefix):
            # add the {class: File, path: some/path/filename } to the yaml file
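
The skeleton above can be made concrete as a small, runnable sketch. The helper name `matching_files` and the shortened `left_reads` list are illustrative assumptions, not part of the original code. It also shows why the full key is too specific for the desired output: `C3N-02289_10_L1` matches only one lane, while the shorter sample prefix matches both.

```python
def matching_files(prefix, filenames):
    """Return the file names that start with `prefix`, in sorted order."""
    return sorted(name for name in filenames if name.startswith(prefix))


# Shortened, hypothetical copy of the question's Left_reads list.
left_reads = [
    'C3N-02289_10_L1_R1.gz',
    'C3N-02289_10_L2_R1.gz',
    'C3N-02671_08_L1_R1.gz',
]

# The full dictionary key matches a single lane only.
print(matching_files('C3N-02289_10_L1', left_reads))

# The sample prefix (key without the lane suffix) matches both lanes.
print(matching_files('C3N-02289_10', left_reads))
```

If one sample ID can be a prefix of another (for example `..._1` and `..._10`), comparing against `prefix + '_'` instead of `prefix` avoids false matches.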

Second, the output of your code is not a valid YAML file. I suggest using the PyYAML package.

Putting it all together, we get:

import yaml

# definition of path variable is here somewhere...

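# edited to take only the prefixes of the keys (drop the trailing _L1/_L2);
# the set removes duplicates so each output file is written only once
desired_keys = sorted({'_'.join(k.split('_')[:-1]) for k in samples1.keys()})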

for prefix in desired_keys:
    yml_filename = prefix + '.yml'
    reads1 = []
    for filename in Left_reads:
        if filename.startswith(prefix):
            reads1.append({'class': 'File', 'path': path + '/' + filename})

    reads2 = []
    for filename in Right_reads:
        if filename.startswith(prefix):
            reads2.append({'class': 'File', 'path': path + '/' + filename})

    data = {'reads1': reads1, 'reads2': reads2 }
    # the with-block closes the file even if yaml.dump raises
    with open(yml_filename, 'w') as stream:
        yaml.dump(data, stream)
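
To make the key-trimming step easier to follow in isolation, here is a minimal sketch that uses only the standard library. The helper name `sample_prefix` and the shortened `keys` list are assumptions for illustration. For keys shaped like the ones in the question, `rsplit('_', 1)[0]` gives the same result as the `'_'.join(k.split('_')[:-1])` expression.

```python
def sample_prefix(key):
    """Drop the trailing lane component: 'C3N-02289_10_L1' -> 'C3N-02289_10'."""
    return key.rsplit('_', 1)[0]


# Shortened, hypothetical copy of samples1.keys().
keys = ['C3N-02289_10_L1', 'C3N-02289_10_L2', 'C3N-02671_08_L1']

# dict.fromkeys drops duplicates while keeping first-seen order.
prefixes = list(dict.fromkeys(sample_prefix(k) for k in keys))
print(prefixes)
```

Each lane of a sample maps to the same prefix, so deduplicating means every output file is produced once rather than once per lane.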

As a side note, I suggest using os.path.join(path, filename) instead of path + '/' + filename, simply to reduce the chance of errors.
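
As a rough illustration of that suggestion, the entry-building step could be wrapped in a small helper. The name `file_entry` and the `/usr/path` directory are assumptions taken from the question's sample output, not from the answer's code.

```python
import os


def file_entry(base_dir, filename):
    """Build one {'class': ..., 'path': ...} mapping for the YAML output."""
    return {'class': 'File', 'path': os.path.join(base_dir, filename)}


entry = file_entry('/usr/path', 'C3N-02289_10_L1_R1.gz')
print(entry['path'])
```

On POSIX systems, `os.path.join` inserts the separator only when it is needed, so a `base_dir` that already ends in `/` does not produce a doubled slash.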

Edit

With the given Left_reads, Right_reads and samples1.keys(), the result is three .yml files:

C3N-02289_10.yml C3N-02671_08.yml C3N-02671_10.yml

The first one, C3N-02289_10.yml, contains:

reads1:
- {class: File, path: /path/yamlTest/__main__/C3N-02289_10_L1_R1.gz}
- {class: File, path: /path/yamlTest/__main__/C3N-02289_10_L2_R1.gz}
reads2:
- {class: File, path: /path/yamlTest/__main__/C3N-02289_10_L1_R2.gz}
- {class: File, path: /path/yamlTest/__main__/C3N-02289_10_L2_R2.gz}