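I'm working on a task to preprocess a CSV file with Python. The task is to read original.csv, which lists one purchased item of a transaction per row. I need to create another CSV file, shaped like target.csv, that shows all the items of a transaction on a single row.
original.csv (each row lists a single item purchased in a transaction)
Sample contents of original.csv: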
Transaction Id Items
1 Eggs
1 Yogurt
1 Apple
2 Tea
2 Rice
In target.csv, all the items of one transaction are listed on a single row, so row 1 contains every item purchased in transaction 1.
Sample contents of target.csv:
1 Eggs,Yogurt,Apple
2 Tea,Rice
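The reshaping described above can be sketched with the standard library alone. This is only a minimal illustration, assuming the rows are already available as (transaction id, item) pairs; the `rows` list and the `group_items` helper are hypothetical names and are not part of my script.

```python
from collections import defaultdict

def group_items(rows):
    """Collect items per transaction id, keeping first-seen order."""
    grouped = defaultdict(list)
    for transaction_id, item in rows:
        grouped[transaction_id].append(item)
    return dict(grouped)

# Sample rows copied from the original.csv excerpt above.
rows = [(1, "Eggs"), (1, "Yogurt"), (1, "Apple"), (2, "Tea"), (2, "Rice")]
for transaction_id, items in group_items(rows).items():
    print(transaction_id, ",".join(items))
```

Every input pair ends up in exactly one list, so the total number of items is the same before and after grouping.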
I'm using the Python code below to convert original.csv into target.csv:
import csv
import pandas as pd

# read the csv file (header=None: the first line is treated as data)
newdf = pd.read_csv('original.csv', header=None)
# create an empty list
basket = []
# Loop over every transaction number in the original csv file and build a
# list of the items belonging to that transaction. Once the loop finishes,
# 'basket' is a list of lists holding the items of all transactions.
for id in newdf[1].unique():
    a = [newdf[2][i] for i, j in enumerate(newdf[1]) if j == id]
    basket.append(a)
# Write a new csv file with all items of one transaction on one row.
# The with block closes the file, so no explicit close() is needed.
with open('target.csv', 'w', newline='') as writeFile:
    writer = csv.writer(writeFile)
    writer.writerows(basket)
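For comparison, here is a sketch of the same conversion done with only the `csv` module. It assumes a comma-separated file with a header line and exactly two columns (transaction id, item). The excerpt suggests that layout but does not confirm it, and the script above reads columns 1 and 2, so the real file may have an extra leading column. The `convert` function and its `has_header` parameter are hypothetical, and `io.StringIO` objects stand in for the real files.

```python
import csv
import io

def convert(src, dst, has_header=True):
    """Read (transaction id, item) rows from src; write one row per transaction to dst."""
    reader = csv.reader(src)
    if has_header:
        next(reader, None)  # skip the header line
    baskets = {}  # dicts keep insertion order, so transactions stay in file order
    for row in reader:
        if len(row) < 2:
            continue  # ignore blank or malformed lines
        transaction_id, item = row[0].strip(), row[1].strip()
        baskets.setdefault(transaction_id, []).append(item)
    writer = csv.writer(dst)
    for transaction_id, items in baskets.items():
        writer.writerow([transaction_id, *items])
    return baskets

# In-memory stand-ins for original.csv and target.csv (assumed layout).
source = io.StringIO("Transaction Id,Items\n1,Eggs\n1,Yogurt\n1,Apple\n2,Tea\n2,Rice\n")
target = io.StringIO()
convert(source, target)
print(target.getvalue())
```

Unlike the script above, this version also writes the transaction id as the first field of each output row, which makes it easier to compare a row against the source.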
Transaction 17 in the output CSV file:
17 milk hand soap pasta individual meals spaghetti sauce cereals sandwich loaves hand soap
Transaction 17 in the original CSV file:
Transaction Id Items
17 Yogurt
17 Milk
17 hand soap
17 pasta
17 individual meals
...
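For transaction 17, one item, "Yogurt", is missing from target.csv. I checked a few other transaction numbers as well and found missing items there too.
How can I display all the data of one transaction on a single row in the new CSV file?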
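A check like the following could help list which items get lost for each transaction. It is a rough sketch, assuming the source rows and the grouped output are both available in memory; `missing_items`, `original_rows`, and `baskets` are hypothetical names, and the data is a made-up five-item stand-in for the truncated transaction 17 listing.

```python
from collections import Counter

def missing_items(original_rows, baskets):
    """Return {transaction_id: Counter of items in the source but absent from the output}."""
    expected = {}
    for transaction_id, item in original_rows:
        expected.setdefault(transaction_id, Counter())[item] += 1
    report = {}
    for transaction_id, counts in expected.items():
        # Counter subtraction keeps only items with a positive shortfall.
        lost = counts - Counter(baskets.get(transaction_id, []))
        if lost:
            report[transaction_id] = lost
    return report

# Hypothetical data shaped like the transaction 17 example (five items only).
original_rows = [(17, "Yogurt"), (17, "Milk"), (17, "hand soap"),
                 (17, "pasta"), (17, "individual meals")]
baskets = {17: ["Milk", "hand soap", "pasta", "individual meals"]}
print(missing_items(original_rows, baskets))
```

Using `Counter` rather than a set means repeated items, such as a product bought twice in one transaction, are compared by count and not just by presence.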