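I have this dataframe.
I want to run a function on it for each unique combination of origin + destination and t_type. Previously I ran the function on t_type only, so I did this: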
open_data = data[data['t_type']=="Open"].reset_index(drop=True)
all_data = data[data['t_type'] == "All"].reset_index(drop=True)
# open_trucks and all_trucks come from another dataframe
open = loader(open_data, open_trucks)
all = loader(all_data, all_trucks)
I retrieved the unique pairs from it:
data.groupby(['Origin','Destination']).size().reset_index()
Output:
Origin Destination 0
Delhi Doon 7
Delhi Gurgaon 1
Delhi Mumbai 8
.
.
.
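How do I extract the data from the dataframe based on O + D? Apologies if this is a duplicate, but the data has to be split twice here: once on O + D, and then on t_type.
I was thinking of something like this pseudocode: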
for unique_pair in pairs:
    open_data = something (which I don't know how to extract)
    all_data = something (ditto)
    run the function and store the output
Data:
t_type Origin Destination
0 Open Doon Gurgaon
1 Open Doon Gurgaon
2 Open Doon Gurgaon
3 Container Delhi Mumbai
4 Container Delhi Mumbai
5 Open Doon Mumbai
6 Open Delhi Mumbai
7 Open Delhi Mumbai
8 Open Delhi Mumbai
9 All Delhi Doon
10 All Delhi Doon
11 All Delhi Doon
12 All Delhi Doon
13 All Doon Gurgaon
14 All Doon Mumbai
15 Open Doon Gurgaon
16 Container Delhi Gurgaon
17 All Delhi Mumbai
18 All Delhi Mumbai
19 Container Delhi Doon
20 Container Delhi Doon
21 Container Delhi Doon
22 Open Delhi Mumbai
23 Container Doon Delhi
24 Container Doon Delhi
25 Container Doon Delhi
26 Container Doon Delhi
27 Container Doon Gurgaon
Answer 0 (score: 1)
I believe you need:
for i, df in data.groupby(['Origin','Destination']):
    # to process by all three columns instead, use:
    # for i, df in data.groupby(['t_type', 'Origin', 'Destination']):
    print(df)
Or use a custom function:
def func(df):
    print(df)
    # per-group processing goes here
    return df

df1 = data.groupby(['Origin','Destination']).apply(func)
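To combine the O + D grouping with the t_type split from the question, a minimal sketch could look like the following. It is only an illustration under some assumptions: the frame is a small stand-in built from a few of the rows shown in the question, `loader` is a hypothetical placeholder (the real function is not shown, so here it just counts rows), and `open_trucks` / `all_trucks` are dummy values.

```python
import pandas as pd

# Small stand-in for the question's frame (a subset of its rows).
data = pd.DataFrame({
    "t_type": ["Open", "Open", "All", "All", "Container", "Open"],
    "Origin": ["Doon", "Delhi", "Delhi", "Doon", "Delhi", "Doon"],
    "Destination": ["Gurgaon", "Mumbai", "Mumbai", "Gurgaon", "Mumbai", "Gurgaon"],
})


def loader(frame, trucks):
    # Hypothetical placeholder: the real loader is not shown in the question.
    # Here it only returns the number of rows it received.
    return len(frame)


# Dummy values; in the question these come from another dataframe.
open_trucks = None
all_trucks = None

results = {}
# Grouping by two columns yields an (Origin, Destination) tuple as the key
# and the matching rows as a sub-frame.
for (origin, destination), pair_df in data.groupby(["Origin", "Destination"]):
    # Second split: filter the rows of this pair by t_type.
    open_data = pair_df[pair_df["t_type"] == "Open"].reset_index(drop=True)
    all_data = pair_df[pair_df["t_type"] == "All"].reset_index(drop=True)
    results[(origin, destination)] = {
        "open": loader(open_data, open_trucks),
        "all": loader(all_data, all_trucks),
    }

print(results)
```

Storing the outputs in a dict keyed by the (Origin, Destination) tuple makes it easy to look up a pair later, e.g. `results[("Delhi", "Mumbai")]`. A pair with no "Open" or "All" rows simply passes an empty frame to `loader`, so the real function may need to handle that case.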