I am using Python's csv module to convert a CSV with multi-valued fields into a Python list. The output contains fields that hold several related values:
['Route', 'Vehicles', 'Vehicle Class', 'Driver_ID', 'Date', 'Start', 'Arrive']
['ABC', 'ZYG098, AB0134, GF0158', 'A1, B2, C3', 'John Doe, Jane Doe, Abraham Lincoln', '20150301', 'A', 'B']
['AC', 'ZGA123', 'C3', 'George Washington', '20150301', 'A', 'C']
['ABC', 'XAZ012, AB0134, YZ089', 'C1, B2, A2 ', 'John Adams, Jane Doe, Thomas Jefferson', '20150302', 'A', 'B']
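I want to convert the Vehicles, Vehicle Class, and Driver_ID fields into nested lists, so that when I sort each sublist of vehicles in row[1] to make sure the vehicles always appear in alphabetical order, the vehicle classes and drivers stay in the corresponding, correct order. The header and the rows would then be arranged like this: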
['Route', 'Vehicles', 'Vehicle Class', 'Driver_ID', 'Date', 'Start', 'Arrive']
['ABC', 'AB0134, GF0158, ZYG098', 'B2, C3, A1', 'Jane Doe, Abraham Lincoln, John Doe', '20150301', 'A', 'B']
['AC', 'ZGA123', 'C3', 'George Washington', '20150301', 'A', 'C']
['ABC', 'AB0134, YZ089, XAZ012', 'B2, A2, C1', 'Jane Doe, Thomas Jefferson, John Adams', '20150302', 'A', 'B']
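So in the output above, each subgroup/list of vehicles is sorted alphabetically, and Vehicle Class and Driver_ID are rearranged as needed to keep their original relationship with their respective vehicles (i.e. driver John Doe drives vehicle ZYG098, which is vehicle class A1, so those items move within their sublists to reflect that ZYG098 is now last rather than first). If this can be done, how would you export the resulting nested lists back to a CSV with the original headers?
Apologies if this is simple or absurd; I am just starting to learn Python. If nested lists are not the best option, I am open to any other solution (for a dictionary I would need to concatenate fields to create a key, since there is no unique key short of a Route_Date combination). If anyone has a solid resource on handling various CSV use cases with Python, a recommendation would be great.
Thanks in advance for your patience and help.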
Answer 0 (score: 1)
Now that we are finally on the same page: it takes a bit of work, but this will do what you want:
import csv

l = [['Route', 'Vehicles', 'Vehicle Class', 'Driver_ID', 'Date', 'Start', 'Arrive'],
     ['ABC', 'ZYG098, AB0134, GF0158', 'A1, B2, C3', 'John Doe, Jane Doe, Abraham Lincoln', '20150301', 'A', 'B'],
     ['AC', 'ZGA123', 'C3', 'George Washington', '20150301', 'A', 'C'],
     ['ABC', 'XAZ012, AB0134, YZ089', 'C1, B2, A2 ', 'John Adams, Jane Doe, Thomas Jefferson', '20150302', 'A', 'B']]

# transpose the original list: rows become columns, columns become rows
it = zip(*l)

# get each column separately, wrapped in iter so the first element
# (the header) can be popped off efficiently
route, veh, veh_c, d_id, date, start, arrive = (iter(col) for col in it)

# get all headers to write later
headers = next(route), next(veh), next(veh_c), next(d_id), next(date), next(start), next(arrive)

srt_veh = []
key_inds = []
# sort vehicle elements and keep a record of the old indexes so the
# subelements in Vehicle Class and Driver_ID can be rearranged to match
# note: split(",") leaves the leading space on later pieces, and a space
# sorts before any letter or digit
for x in veh:
    spl = x.split(",")
    srt = sorted(spl)
    key_inds.append([spl.index(w) for w in srt])
    srt_veh.append(",".join(srt).strip())

srt_veh_cls = []
# reorder vehicle classes based on the old index of the elements in vehicles
# and rejoin the split elements
for ind, ele in enumerate(veh_c):
    spl = ele.split(",")
    srt_veh_cls.append(",".join([spl[i].strip() for i in key_inds[ind]]))

srt_dr_id = []
# reorder driver ids based on the old index of the elements in vehicles
# and join the subelements again after splitting and reordering
for ind, ele in enumerate(d_id):
    spl = ele.split(",")
    srt_dr_id.append(",".join([spl[i].strip() for i in key_inds[ind]]))

# transpose again for writing
zipped = zip(route, srt_veh, srt_veh_cls, srt_dr_id, date, start, arrive)
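Finally, write the rows out with the csv writer's writerows:
# newline="" is the Python 3 csv idiom; on Python 2 open with mode "wb" instead
with open("out.csv", "w", newline="") as f:
    wr = csv.writer(f)
    wr.writerow(headers)
    wr.writerows(zipped)
Output: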
Route,Vehicles,Vehicle Class,Driver_ID,Date,Start,Arrive
ABC,"AB0134, GF0158,ZYG098","B2,C3,A1","Jane Doe,Abraham Lincoln,John Doe",20150301,A,B
AC,ZGA123,C3,George Washington,20150301,A,C
ABC,"AB0134, YZ089,XAZ012","B2,A2,C1","Jane Doe,Thomas Jefferson,John Adams",20150302,A,B
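For Python 2, replace zip with itertools.izip, and use itertools.imap wherever you use map:
from itertools import izip, imap
You could zip more and do a few other things to shorten the code, but I don't think that would help readability.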
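For comparison, here is a shorter sketch of the "zip more" idea. It is not the code above but an alternative under stated assumptions: every multi-valued field in a row has the same number of comma-separated pieces, and each piece is stripped of whitespace before sorting. In the output above the pieces after the first keep their leading space, which is why XAZ012 lands after YZ089; stripping first gives strict alphabetical order. The name sort_linked_fields and its parameters are made up for illustration.

```python
def sort_linked_fields(row, key_col=1, linked_cols=(2, 3)):
    """Sort the values in row[key_col] and reorder the linked fields to match."""
    cols = (key_col,) + tuple(linked_cols)
    # split every multi-valued field and strip stray whitespace from each piece
    pieces = [[v.strip() for v in row[c].split(",")] for c in cols]
    # zip groups (vehicle, class, driver) so each group moves as one unit;
    # sorting on the first item only keeps ties in their original order
    groups = sorted(zip(*pieces), key=lambda g: g[0])
    out = list(row)
    # zip(*groups) turns the sorted groups back into one sequence per column
    for c, values in zip(cols, zip(*groups)):
        out[c] = ", ".join(values)
    return out


rows = [
    ['Route', 'Vehicles', 'Vehicle Class', 'Driver_ID', 'Date', 'Start', 'Arrive'],
    ['ABC', 'ZYG098, AB0134, GF0158', 'A1, B2, C3', 'John Doe, Jane Doe, Abraham Lincoln', '20150301', 'A', 'B'],
    ['AC', 'ZGA123', 'C3', 'George Washington', '20150301', 'A', 'C'],
    ['ABC', 'XAZ012, AB0134, YZ089', 'C1, B2, A2 ', 'John Adams, Jane Doe, Thomas Jefferson', '20150302', 'A', 'B'],
]

header, body = rows[0], rows[1:]
result = [header] + [sort_linked_fields(r) for r in body]
for r in result:
    print(r)
```

The result list has the same shape as the input, so it can be passed to csv.writer's writerows as is. If two fields in a row have different numbers of pieces, zip silently drops the extras, so real data may need a length check first.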
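Answer 1 (score: 0)
To convert to the nested format you describe (on Python 3, wrap zip in list to get an actual list):
nested = list(zip(*lst))
zip is its own inverse:
orig = list(zip(*nested))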
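A minimal sketch of that round trip, using a made-up three-column subset of the data. It assumes Python 3 and rows of equal length; zip yields tuples, so the rows come back as tuples rather than lists.

```python
lst = [
    ["Route", "Vehicles", "Date"],
    ["ABC", "ZYG098", "20150301"],
    ["AC", "ZGA123", "20150301"],
]

nested = list(zip(*lst))   # one tuple per column
orig = list(zip(*nested))  # transposing again gives one tuple per row

print(nested[1])                         # ('Vehicles', 'ZYG098', 'ZGA123')
print(orig == [tuple(r) for r in lst])   # True
```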
But maybe what you really want is:
import operator
sort = sorted(lst[1:], key=operator.itemgetter(1))
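This gives you a new list of the data rows sorted by column 1 (the Vehicles field). In this case you haven't changed the format of the data, so you should be able to dump it back out as CSV without modification, although you'd need to put the original header, lst[0], back in front first.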
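A rough sketch of that last step, assuming Python 3 and the sample data from the question. It writes to an in-memory buffer so it runs without touching the disk; for a real file, open it with newline="" instead. Note that this orders whole rows by their Vehicles string and does not reorder the values inside a field.

```python
import csv
import io
import operator

lst = [
    ['Route', 'Vehicles', 'Vehicle Class', 'Driver_ID', 'Date', 'Start', 'Arrive'],
    ['ABC', 'ZYG098, AB0134, GF0158', 'A1, B2, C3', 'John Doe, Jane Doe, Abraham Lincoln', '20150301', 'A', 'B'],
    ['AC', 'ZGA123', 'C3', 'George Washington', '20150301', 'A', 'C'],
    ['ABC', 'XAZ012, AB0134, YZ089', 'C1, B2, A2 ', 'John Adams, Jane Doe, Thomas Jefferson', '20150302', 'A', 'B'],
]

header, body = lst[0], lst[1:]
# sort the data rows by the Vehicles field, leaving the header out of the sort
body_sorted = sorted(body, key=operator.itemgetter(1))

buf = io.StringIO()
writer = csv.writer(buf, lineterminator="\n")
writer.writerow(header)         # the original header goes first
writer.writerows(body_sorted)   # then the sorted data rows
text = buf.getvalue()
print(text)
```

The csv writer quotes any field that contains a comma, so the multi-valued fields survive the round trip as single columns.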