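I need to create separate files based on a column.
I get the data from source #1.
I then send the data to source #2, but source #2 only recognizes the codes in column 2 of codelist.csv.
I can already read the data and replace the codes.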
testdata.csv
1|b|430418886
1|f|434324988
1|c|445454512
1|f|430418574
1|a|432343445
1|d|437657654
1|e|424328828
1|a|430236546
1|e|434565445
1|c|430418988
1|d|430420012
1|b|476556568
codelist.csv
a|171
b|172
c|173
d|174
e|176
f|177
I can create the complete list, but I can't split it into separate files by code. The resulting set of files should look like this.
171.csv
1|171|432343445
1|171|430236546
172.csv
1|172|430418886
1|172|476556568
173.csv
1|173|445454512
1|173|430418988
174.csv
1|174|437657654
1|174|430420012
176.csv
1|176|424328828
1|176|434565445
177.csv
1|177|434324988
1|177|430418574
So far, my code creates the complete list.
def get_site_code_dict(site_code_file):
    # Map each letter (column 1 of the code list) to its numeric code (column 2)
    mydict = dict()
    with open(site_code_file) as inputs:
        for line in inputs:
            name, code = line.strip().split("|")
            mydict[name] = code
    return mydict

def process_raw(raw_file, site_code_dict):
    # The with statement closes both files, so no explicit close() is needed
    with open(raw_file) as inputs, open('output.csv', 'w') as outlist:
        for line in inputs:
            active, code, idnumber = line.strip().split("|")
            outlist.write("1" + '|')
            outlist.write(site_code_dict[code] + '|')
            outlist.write(idnumber + '\n')

if __name__ == "__main__":
    site_code_dict = get_site_code_dict("codelist.csv")
    process_raw("testdata.csv", site_code_dict)
Output:
1|172|430418886
1|177|434324988
1|173|445454512
1|177|430418574
1|171|432343445
1|174|437657654
1|176|424328828
1|171|430236546
1|176|434565445
1|173|430418988
1|174|430420012
1|172|476556568
I was thinking of writing a second script that takes the final list and splits it, but doing everything in one script would be best.
Answer 0 (score: 0)
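This is a common pattern that can be solved with a dictionary and two for loops. It applies whenever elements need to be grouped by a shared attribute, which in this case is the code.
The idea behind the code is:
1) Create a dictionary that can group everything by code
2) Loop over all the records and add each one to the dictionary
3) Loop over all the keys of the final dictionary and write out the records for each code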
def get_site_code_dict(site_code_file):
    mydict = dict()
    with open(site_code_file) as inputs:
        for line in inputs:
            name, code = line.strip().split("|")
            mydict[name] = code
    return mydict

def process_raw(raw_file, site_code_dict):
    code_to_instances = {}  # set up our empty mapping/dictionary
    with open(raw_file) as inputs:
        for line in inputs:
            active, letter, idnumber = line.strip().split("|")
            code = site_code_dict[letter]
            if code not in code_to_instances:  # if the code hasn't yet been added to the dict
                code_to_instances[code] = []  # create an entry with an empty list of instances
            code_to_instances[code].append({'active': active, 'id': idnumber})  # add the element to the list
    for code, instances in code_to_instances.items():  # for each code
        with open(code + '.csv', 'w') as outlist:  # open a file named after the code
            for instance in instances:  # for each instance
                outlist.write(instance['active'] + '|')  # write the instance information, one record per line
                outlist.write(code + '|')
                outlist.write(instance['id'] + '\n')

if __name__ == "__main__":
    site_code_dict = get_site_code_dict("codelist.csv")
    process_raw("testdata.csv", site_code_dict)
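The same grouping can also be written with `collections.defaultdict`, which removes the "is the code already in the dictionary?" check. The following is only a sketch of that variant, not a replacement for the code above: the function names `load_codes`, `group_by_code`, and `write_groups`, as well as the `out_dir` parameter, are made up for this illustration. The functions take any iterable of lines, so they work on open file objects or on plain lists.

```python
from collections import defaultdict
from pathlib import Path


def load_codes(lines):
    """Map each letter to its numeric code, e.g. 'a' -> '171'."""
    codes = {}
    for line in lines:
        line = line.strip()
        if line:  # skip blank lines
            letter, code = line.split("|")
            codes[letter] = code
    return codes


def group_by_code(lines, codes):
    """Return {code: [output_line, ...]}, keeping input order within each code."""
    groups = defaultdict(list)  # a missing key starts out as an empty list
    for line in lines:
        line = line.strip()
        if not line:
            continue
        active, letter, idnumber = line.split("|")
        code = codes[letter]  # raises KeyError if the letter is not in the code list
        groups[code].append("|".join((active, code, idnumber)))
    return groups


def write_groups(groups, out_dir="."):
    """Write one <code>.csv file per group into out_dir (hypothetical parameter)."""
    for code, rows in groups.items():
        Path(out_dir, code + ".csv").write_text("\n".join(rows) + "\n")


# Usage, assuming codelist.csv and testdata.csv are in the working directory:
#
#     with open("codelist.csv") as f:
#         codes = load_codes(f)
#     with open("testdata.csv") as f:
#         groups = group_by_code(f, codes)
#     write_groups(groups)
```

Like the code above, this keeps every record in memory before writing, which is fine for small inputs such as the sample data. Separating the grouping from the file writing also makes the grouping step easy to check on its own.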