Creating multiple files from one CSV column. Python 2.7.12

Time: 2017-01-19 01:37:53

Tags: python-2.7

I need to create separate files based on the value of a column.

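I get data from source #1.
I then send the data to source #2, but source #2 only recognizes its own codes in column 2. I can already read the data and replace the codes.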

testdata.csv

1|b|430418886
1|f|434324988
1|c|445454512
1|f|430418574
1|a|432343445
1|d|437657654
1|e|424328828
1|a|430236546
1|e|434565445
1|c|430418988
1|d|430420012
1|b|476556568

codelist.csv

a|171
b|172
c|173
d|174
e|176
f|177

I can create the complete list, but I can't split it into separate files by code. The set of files should look like this.

171.csv
1|171|432343445
1|171|430236546

172.csv
1|172|430418886
1|172|476556568

173.csv
1|173|445454512
1|173|430418988

174.csv
1|174|437657654
1|174|430420012

176.csv
1|176|424328828
1|176|434565445

177.csv
1|177|434324988
1|177|430418574

This is my code so far; it creates the complete list.

def get_site_code_dict(site_code_file):
    mydict = dict()
    with open(site_code_file) as inputs:
        for line in inputs:
            name, code = line.strip().split("|")
            mydict[name] = code
    return mydict

def process_raw(raw_file, site_code_dict):
    # The with statement closes both files on exit, so no explicit close() is needed.
    with open(raw_file) as inputs, open('ouput.csv', 'w') as outlist:
        for line in inputs:
            active, code, idnumber = line.strip().split("|")
            outlist.write("1"+'|')
            outlist.write(site_code_dict[code]+'|')
            outlist.write(idnumber+'\n')


if __name__ == "__main__":
    site_code_dict = get_site_code_dict("codelist.csv")
    process_raw("testdata.csv", site_code_dict)

Output:

1|172|430418886
1|177|434324988
1|173|445454512
1|177|430418574
1|171|432343445
1|174|437657654
1|176|424328828
1|171|430236546
1|176|434565445
1|173|430418988
1|174|430420012
1|172|476556568

I'm considering writing a second script that takes the final list and splits it, but doing everything in one script would be best.

1 Answer:

Answer 0: (score: 0)

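This is a common pattern that can be solved with a dictionary and two for loops. It applies whenever elements need to be grouped by a shared attribute, which in this case is the code.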

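The idea behind the code is:

1) Create a dictionary that groups everything by code

2) Loop over all records and add each one to the dictionary

3) Loop over the keys of the final dictionary and write out the records for each code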
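
As a minimal sketch of these three steps on in-memory data, before any file handling is involved: the `group_by_code` helper and the `records` list are hypothetical names used only for illustration, and the sample values are taken from the question's expected output.

```python
def group_by_code(records):
    """Group (active, code, idnumber) tuples by their code."""
    groups = {}  # step 1: empty dictionary keyed by code
    for active, code, idnumber in records:  # step 2: add each record to its group
        groups.setdefault(code, []).append((active, idnumber))
    return groups


# Hypothetical sample records, already carrying the translated codes.
records = [
    ("1", "172", "430418886"),
    ("1", "177", "434324988"),
    ("1", "172", "476556568"),
]

groups = group_by_code(records)
for code in sorted(groups):  # step 3: walk the keys and output per code
    for active, idnumber in groups[code]:
        print("|".join([active, code, idnumber]))
```

The full script applies the same three steps, reading the records from the raw file and writing each group to its own file instead of printing it.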

def get_site_code_dict(site_code_file):
    mydict = dict()
    with open(site_code_file) as inputs:
        for line in inputs:
            name, code = line.strip().split("|")
            mydict[name] = code
    return mydict

def process_raw(raw_file, site_code_dict):
    code_to_instances = {} # set up our empty mapping/dictionary
    with open(raw_file) as inputs:
        for line in inputs:
            active, letter, idnumber = line.strip().split("|")
            code = site_code_dict[letter]
            if code not in code_to_instances: # if the code hasn't yet been added to the dict
                code_to_instances[code] = [] # Create an entry with a blank list of instances
            code_to_instances[code].append({ 'active': active, 'id': idnumber }) # Add the element to the list

    for code in code_to_instances: # for each code
        with open(code + '.csv', 'w') as outlist: # open a file with a name based on the code
            for instance in code_to_instances[code]: # for each instance
                outlist.write(instance['active'] +'|') # write the instance information per line
                outlist.write(code +'|')
                outlist.write(instance['id'] +'\n')

if __name__ == "__main__":
    site_code_dict = get_site_code_dict("codelist.csv")
    process_raw("testdata.csv", site_code_dict)
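
A slightly more compact variant of the same approach, offered as a sketch rather than a drop-in replacement: `collections.defaultdict` removes the explicit check for whether a code is already in the dictionary. The function name `split_by_code` and the `out_dir` parameter are assumptions added for illustration; the input is assumed to be the pipe-delimited format from the question.

```python
import os
from collections import defaultdict


def split_by_code(raw_file, site_code_dict, out_dir="."):
    """Write one <code>.csv per code and return the sorted list of codes written."""
    lines_by_code = defaultdict(list)  # missing keys start out as an empty list
    with open(raw_file) as inputs:
        for line in inputs:
            line = line.strip()
            if not line:  # skip blank lines
                continue
            active, letter, idnumber = line.split("|")
            code = site_code_dict[letter]  # raises KeyError for an unknown letter
            lines_by_code[code].append("|".join([active, code, idnumber]))

    # One output file per code, named after the code (hypothetical out_dir).
    for code, lines in lines_by_code.items():
        with open(os.path.join(out_dir, code + ".csv"), "w") as out:
            out.write("\n".join(lines) + "\n")
    return sorted(lines_by_code)
```

Calling `split_by_code("testdata.csv", get_site_code_dict("codelist.csv"))` would write the files into the current directory. Like the version above, it holds all records in memory before writing, which is fine for small inputs.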