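Overview
I want to extract various pieces of information, such as name, date, and address, from a two-column CSV file before writing them to another CSV file.
Conditions
Dummy source CSV data as viewed in Excel, for reference on the file format: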
ID,DATA
88888,DADDY
88888,2/06/2016
88888,new issac road
99999,MUMMY
99999,samsung road
99999,12/02/2016
Desired CSV result:
ID,Name,Address,DATE
88888,DADDY,new issac road,2/06/2016
99999,MUMMY,samsung road,12/02/2016
What I have so far:
import csv
from collections import defaultdict

columns = defaultdict(list)  # each value in each column is appended to a list

with open('dummy_data.csv') as f:
    reader = csv.DictReader(f)  # read rows into a dictionary format
    for row in reader:  # read a row as {column1: value1, column2: value2, ...}
        for (k, v) in row.items():  # go over each column name and value
            columns[k].append(v)  # append the value to the list for column name k

# note: the dummy data above names this column 'ID', not 'receipt_id'
uniqueidstatement = columns['receipt_id']
print(uniqueidstatement)

with open("wtf.csv", 'w', newline='') as resultFile:
    wr = csv.writer(resultFile, dialect='excel')
    wr.writerow(uniqueidstatement)
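Answer 0 (score: 0)
You can group the rows by ID. Then, within each group, you can use some simple logic to determine which value is the date and which is the address.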
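import csv
from itertools import groupby
from operator import itemgetter

with open("test.csv", newline="") as f, open("out.csv", "w", newline="") as out:
    reader = csv.reader(f)
    next(reader)  # skip the ID,DATA header row
    writer = csv.writer(out)
    writer.writerow(["ID", "Name", "Address", "DATE"])
    # groupby only merges adjacent rows, so rows with the same ID must be contiguous
    for id_, group in groupby(reader, key=itemgetter(0)):
        # the first row of each group is assumed to hold the name
        name = next(group)[1]
        # the next two rows hold the date and the address, in either order
        add_date_1, add_date_2 = next(group)[1], next(group)[1]
        # crude heuristic taken from the sample data: the address contains "road"
        if "road" in add_date_2:
            date, address = add_date_1, add_date_2
        else:
            date, address = add_date_2, add_date_1
        writer.writerow([id_, name, address, date])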
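If the real addresses do not all contain "road", one possible variation is to recognise the date by trying to parse it. The sketch below is only an illustration under some assumptions: the helper names is_date and reshape are made up for this example, the date format is guessed to be day/month/year, and the first value seen for each ID is taken to be the name. It collects values in a defaultdict rather than using groupby, so rows for the same ID do not have to be adjacent. The sample data is read from a string so the snippet can be run as-is.

```python
import csv
import io
from collections import defaultdict
from datetime import datetime


def is_date(text, fmt="%d/%m/%Y"):
    """Return True if text parses as a date in the assumed format."""
    try:
        datetime.strptime(text, fmt)
        return True
    except ValueError:
        return False


def reshape(rows):
    """Collect the DATA values per ID, then classify them.

    Assumes the first value seen for an ID is the name and that, among
    the remaining values, the one that parses as a date is the date and
    the other one is the address.
    """
    grouped = defaultdict(list)  # keeps first-seen ID order on Python 3.7+
    for id_, value in rows:
        grouped[id_].append(value)

    result = []
    for id_, values in grouped.items():
        name, rest = values[0], values[1:]
        date = next((v for v in rest if is_date(v)), "")
        address = next((v for v in rest if not is_date(v)), "")
        result.append([id_, name, address, date])
    return result


# Sample data copied from the question; a real script would open the file instead.
sample = """ID,DATA
88888,DADDY
88888,2/06/2016
88888,new issac road
99999,MUMMY
99999,samsung road
99999,12/02/2016
"""

reader = csv.reader(io.StringIO(sample))
next(reader)  # skip the header row
records = reshape(reader)

out = io.StringIO()
writer = csv.writer(out, lineterminator="\n")
writer.writerow(["ID", "Name", "Address", "DATE"])
writer.writerows(records)
print(out.getvalue())
```

To use this on real files, replace the two io.StringIO objects with open("test.csv", newline="") and open("out.csv", "w", newline=""). If the dates are actually month/day/year, pass a different fmt to is_date.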