Question

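I have a pandas DataFrame that I need to split into several tables based on a value and save to several .csv files. This approach seems to work, but it creates a column (the first one) that I can't drop (the second column gets dropped instead). Can someone tell me why it is there and how I can get rid of it? Thanks. Here is the code: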

new_d['Supplier'] = new_d.apply(lambda row: determine_supplier(row), axis = 1)
new_d.sort_values(by = ['Supplier'], inplace = True)
new_d.set_index(keys = ['Supplier'], drop = False, inplace = True)
suppliers = new_d['Supplier'].unique().tolist()
for supplier in suppliers:
  po = new_d.loc[new_d.Supplier == supplier] #the problem is here?
  po = po.drop(po.columns[[0]], axis = 1) # can't drop
  po.to_csv(path_or_buf = r'PO\\' + supplier + '_PO.csv')

Answer 1

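The first column in the output is the DataFrame's index, not a regular column. That is why po.columns[[0]] points at the next column and drop removes that one instead.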

Pass index=False to to_csv to omit it:

po.to_csv(path_or_buf = r'PO\\'+ supplier+'_PO.csv',index=False)
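
To see the effect without touching the file system, here is a minimal sketch. The sample data and the Item column are made up for illustration; only Supplier comes from the question. Called without a path, to_csv returns the CSV text as a string, so the header rows can be compared directly.

```python
import pandas as pd

# Hypothetical sample data; 'Item' is an invented column name.
new_d = pd.DataFrame({
    "Item": ["bolt", "nut", "gear"],
    "Supplier": ["Acme", "Acme", "Bolt Co"],
})

# Same step as in the question: Supplier becomes the index and stays a column.
new_d.set_index(keys=["Supplier"], drop=False, inplace=True)

# With no path argument, to_csv returns the CSV text instead of writing a file.
with_index = new_d.to_csv()
without_index = new_d.to_csv(index=False)

print(with_index.splitlines()[0])     # header row including the index
print(without_index.splitlines()[0])  # header row without the index
print(new_d.columns[0])               # first real column, the one drop() would remove
```

The index is written as an extra leading field by default, while new_d.columns[0] is the first real column, which explains why dropping column 0 removes the wrong thing.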

Or better, instead of:

for supplier in suppliers:
  po = new_d.loc[new_d.Supplier == supplier] #the problem is here?
  po = po.drop(po.columns[[0]], axis = 1) # can't drop
  po.to_csv(path_or_buf = r'PO\\' + supplier + '_PO.csv')

use groupby for the loop:

for supplier, po in new_d.groupby('Supplier'):
    po.to_csv(r'PO\\'+ supplier +'_PO.csv',index=False)
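
A self-contained sketch of the groupby approach follows. The sample data, the Item column, and the temporary output folder are assumptions for illustration, not part of the original code. It also skips the set_index step from the question: in recent pandas versions, grouping by 'Supplier' while it is both an index level and a column raises a ValueError because the name is ambiguous. The sort_values step is not needed either, since groupby sorts the group keys by default.

```python
import os
import tempfile

import pandas as pd

# Hypothetical sample data; 'Item' is an invented column name.
new_d = pd.DataFrame({
    "Item": ["bolt", "nut", "gear"],
    "Supplier": ["Bolt Co", "Acme", "Acme"],
})

# Temporary folder standing in for the PO directory from the question.
out_dir = tempfile.mkdtemp()

written = []
# groupby yields (key, sub-frame) pairs, one per distinct supplier.
for supplier, po in new_d.groupby("Supplier"):
    path = os.path.join(out_dir, supplier + "_PO.csv")
    po.to_csv(path, index=False)  # index=False keeps the index out of the file
    written.append(path)

# Read back the first file to check its contents.
with open(written[0]) as f:
    first_file = f.read()
print(first_file)
```

Each file then contains only the real columns, one file per supplier.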
