How to run a loop over the unique pairs of two columns

Asked: 2019-04-30 07:10:48

Tags: python python-3.x pandas dataframe

I have this DataFrame, data (its contents are listed at the end of the question).

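I want to run a function on it for each unique combination of Origin + Destination and t_type. Previously I ran the function on t_type only, so I did this: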

open_data = data[data['t_type'] == "Open"].reset_index(drop=True)
all_data = data[data['t_type'] == "All"].reset_index(drop=True)
# open_trucks and all_trucks come from another DataFrame.
# The results are not named open/all, which would shadow Python built-ins.
open_result = loader(open_data, open_trucks)
all_result = loader(all_data, all_trucks)

I retrieved the unique pairs with:

data.groupby(['Origin','Destination']).size().reset_index()

Output:

Origin  Destination   0
Delhi   Doon          7
Delhi   Gurgaon       1
Delhi   Mumbai        8
.
.
.

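How do I extract the data from the DataFrame based on O + D? Apologies if this is a duplicate, but here the data has to be split twice: once on O + D, and then on t_type.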

I was thinking of this pseudocode:

for unique_pair in pairs:
    open_data = something(which I don't know how to extract)
    all_data = something(ditto)
    run the function and store the output

Data:

          t_type Origin Destination
0        Open   Doon     Gurgaon
1        Open   Doon     Gurgaon
2        Open   Doon     Gurgaon
3   Container  Delhi      Mumbai
4   Container  Delhi      Mumbai
5        Open   Doon      Mumbai
6        Open  Delhi      Mumbai
7        Open  Delhi      Mumbai
8        Open  Delhi      Mumbai
9         All  Delhi        Doon
10        All  Delhi        Doon
11        All  Delhi        Doon
12        All  Delhi        Doon
13        All   Doon     Gurgaon
14        All   Doon      Mumbai
15       Open   Doon     Gurgaon
16  Container  Delhi     Gurgaon
17        All  Delhi      Mumbai
18        All  Delhi      Mumbai
19  Container  Delhi        Doon
20  Container  Delhi        Doon
21  Container  Delhi        Doon
22       Open  Delhi      Mumbai
23  Container   Doon       Delhi
24  Container   Doon       Delhi
25  Container   Doon       Delhi
26  Container   Doon       Delhi
27  Container   Doon     Gurgaon

1 answer:

Answer 0 (score: 1)

I believe you need:

# If you need to process by three columns instead, use:
# for i, df in data.groupby(['t_type', 'Origin', 'Destination']):
for i, df in data.groupby(['Origin', 'Destination']):
    # i is the (Origin, Destination) key, df holds the rows for that pair
    print(df)
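
To connect this loop to the pseudocode in the question, here is a minimal sketch of how each (Origin, Destination) group could then be split by t_type. The loader function below is a hypothetical stand-in that only counts rows, since the real loader is not shown, and open_trucks and all_trucks are placeholders for the asker's other inputs. The sample frame is a small made-up subset in the same shape as the question's data.

```python
import pandas as pd

# Small sample in the same shape as the question's data.
data = pd.DataFrame({
    "t_type": ["Open", "Open", "All", "All", "Container", "Open"],
    "Origin": ["Doon", "Delhi", "Delhi", "Doon", "Delhi", "Delhi"],
    "Destination": ["Gurgaon", "Mumbai", "Mumbai", "Gurgaon", "Mumbai", "Mumbai"],
})


def loader(frame, trucks):
    # Hypothetical stand-in for the asker's loader(); it just counts rows.
    return len(frame)


# Placeholders for the inputs that come from another DataFrame.
open_trucks = all_trucks = None

results = {}
# Grouping by a list of two columns yields a (Origin, Destination) tuple as key.
for (origin, destination), group in data.groupby(["Origin", "Destination"]):
    # Second split: isolate the t_type values of interest inside this pair.
    open_data = group[group["t_type"] == "Open"].reset_index(drop=True)
    all_data = group[group["t_type"] == "All"].reset_index(drop=True)
    results[(origin, destination)] = {
        "open": loader(open_data, open_trucks),
        "all": loader(all_data, all_trucks),
    }
```

Storing the outputs in a dict keyed by the pair makes it easy to look up the result for one route afterwards, for example results[("Delhi", "Mumbai")].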

Or use a custom function:

def func(df):
    # df contains the rows of one (Origin, Destination) group
    print(df)
    # per-group processing goes here

    return df

df1 = data.groupby(['Origin','Destination']).apply(func)
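
If the per-group function returns something other than a DataFrame, one possible alternative is to group by all three columns at once and collect the results in a dict. The sketch below assumes only the "Open" and "All" truck types matter, as in the question. The summarize function is a hypothetical placeholder for the real processing, and the sample frame is made up for illustration.

```python
import pandas as pd

# Small sample in the same shape as the question's data.
data = pd.DataFrame({
    "t_type": ["Open", "Open", "All", "All", "Container", "Open"],
    "Origin": ["Doon", "Delhi", "Delhi", "Doon", "Delhi", "Delhi"],
    "Destination": ["Gurgaon", "Mumbai", "Mumbai", "Gurgaon", "Mumbai", "Mumbai"],
})


def summarize(frame):
    # Hypothetical per-group processing; here it only counts rows.
    return len(frame)


# Keep only the truck types of interest before grouping.
wanted = data[data["t_type"].isin(["Open", "All"])]

# One entry per (t_type, Origin, Destination) combination present in the data.
results = {
    key: summarize(group)
    for key, group in wanted.groupby(["t_type", "Origin", "Destination"])
}
```

Each key is a (t_type, Origin, Destination) tuple, so both levels of separation happen in a single groupby call.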