I have the following DataFrame.
DataFrame code:
import pandas as pd

df = pd.DataFrame({'Car Type': ['Compact']*9 + ['Economy'],
                   'Supplier': ['Alamo','Enterprise','Budget','Nation','Avis','Payless','Payless','Payless','E-ZRent-a-Car','E-ZRent-a-Car'],
                   'Total Price': [74]*3 + [78,79,84,35,37,43,43],
                   'Location': ['Altanta']*10,
                   'Pick-up Date': ['Jun/12/2019']*6 + ['Jun/13/2019']*4,
                   'Date Accessed': ['06-11-2019']*10})
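I need to create a DataFrame that lists the unique combinations of 'Supplier', 'Car Type', 'Pick-up Date' and 'Date Accessed', together with the number of competing offers and the best competitor price available 1-14 days before the 'Pick-up Date'.
Any help would be greatly appreciated.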
Answer 0: (score: 0)
To get the unique combinations of the columns you describe:
df.drop_duplicates(subset=["Supplier", "Car Type", "Pick-up Date", "Date Accessed"])
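Note that the call above keeps every column of the first row found for each combination. If only the list of combinations is wanted, a minimal sketch (the sample data is copied from the question; the names `keys` and `combos` are chosen here purely for illustration) is to select the key columns first:

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    'Car Type': ['Compact']*9 + ['Economy'],
    'Supplier': ['Alamo', 'Enterprise', 'Budget', 'Nation', 'Avis',
                 'Payless', 'Payless', 'Payless',
                 'E-ZRent-a-Car', 'E-ZRent-a-Car'],
    'Total Price': [74]*3 + [78, 79, 84, 35, 37, 43, 43],
    'Location': ['Altanta']*10,
    'Pick-up Date': ['Jun/12/2019']*6 + ['Jun/13/2019']*4,
    'Date Accessed': ['06-11-2019']*10,
})

# Hypothetical helper names, not part of the original answer.
keys = ['Supplier', 'Car Type', 'Pick-up Date', 'Date Accessed']

# Keep only the key columns, then drop repeated combinations.
combos = df[keys].drop_duplicates().reset_index(drop=True)
print(combos)
```

With the sample data this leaves nine rows, because the two Payless quotes for Jun/13/2019 share the same combination.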
Answer 1: (score: 0)
IIUC,
You need to filter the DataFrame first, then sort by total price and drop duplicates.
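# The dates are stored as strings, so convert them before subtracting.
df['Pick-up Date'] = pd.to_datetime(df['Pick-up Date'], format='%b/%d/%Y')
df['Date Accessed'] = pd.to_datetime(df['Date Accessed'], format='%m-%d-%Y')
df[(df['Pick-up Date'] - df['Date Accessed']) < pd.Timedelta(days=14)]\
.sort_values('Total Price', ascending=False).drop_duplicates(['Car Type', 'Supplier',
'Pick-up Date', 'Date Accessed'])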
Output:
   Car Type       Supplier  Total Price  Location  Pick-up Date  Date Accessed
5   Compact        Payless           84   Altanta    2019-06-12     2019-06-11
4   Compact           Avis           79   Altanta    2019-06-12     2019-06-11
3   Compact         Nation           78   Altanta    2019-06-12     2019-06-11
0   Compact          Alamo           74   Altanta    2019-06-12     2019-06-11
1   Compact     Enterprise           74   Altanta    2019-06-12     2019-06-11
2   Compact         Budget           74   Altanta    2019-06-12     2019-06-11
8   Compact  E-ZRent-a-Car           43   Altanta    2019-06-13     2019-06-11
9   Economy  E-ZRent-a-Car           43   Altanta    2019-06-13     2019-06-11
7   Compact        Payless           37   Altanta    2019-06-13     2019-06-11
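The question also asks for the number of competing offers and the best competitor price, which the filter above does not compute. Here is one possible sketch, under assumptions that are mine rather than the asker's: a "competitor" is any other supplier quoting the same 'Car Type', 'Pick-up Date' and 'Date Accessed'; each supplier is represented by its cheapest quote; and 'Location' is ignored because it is constant in the sample. The names `market`, `best`, `pairs`, `stats`, `result`, `n_competitors` and `best_competitor_price` are made up for illustration.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    'Car Type': ['Compact']*9 + ['Economy'],
    'Supplier': ['Alamo', 'Enterprise', 'Budget', 'Nation', 'Avis',
                 'Payless', 'Payless', 'Payless',
                 'E-ZRent-a-Car', 'E-ZRent-a-Car'],
    'Total Price': [74]*3 + [78, 79, 84, 35, 37, 43, 43],
    'Location': ['Altanta']*10,
    'Pick-up Date': ['Jun/12/2019']*6 + ['Jun/13/2019']*4,
    'Date Accessed': ['06-11-2019']*10,
})
df['Pick-up Date'] = pd.to_datetime(df['Pick-up Date'], format='%b/%d/%Y')
df['Date Accessed'] = pd.to_datetime(df['Date Accessed'], format='%m-%d-%Y')

# Keep quotes accessed 1 to 14 days before pick-up (inclusive on both ends).
lead_days = (df['Pick-up Date'] - df['Date Accessed']).dt.days
window = df[lead_days.between(1, 14)]

# Assumption: suppliers compete when these three columns match.
market = ['Car Type', 'Pick-up Date', 'Date Accessed']

# Cheapest quote per supplier within each market.
best = window.groupby(market + ['Supplier'], as_index=False)['Total Price'].min()

# Pair every supplier with every other supplier in the same market.
pairs = best.merge(best, on=market, suffixes=('', '_comp'))
pairs = pairs[pairs['Supplier'] != pairs['Supplier_comp']]

# Count the competitors and take the lowest competing price.
stats = (pairs.groupby(market + ['Supplier'])
              .agg(n_competitors=('Supplier_comp', 'nunique'),
                   best_competitor_price=('Total Price_comp', 'min'))
              .reset_index())

# Left merge so suppliers without any competitor are kept (count 0, price NaN).
result = best.merge(stats, on=market + ['Supplier'], how='left')
result['n_competitors'] = result['n_competitors'].fillna(0).astype(int)
print(result)
```

The self-merge is quadratic in the number of suppliers per market, which is fine for data of this size; for much larger inputs a per-group computation may be preferable.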