I have the following pandas DataFrame. It is large, with more than 500k rows.
Event_Number Well p_and_s
0 1 7 4.0
1 1 9 0.0
2 1 15 0.0
3 2 7 2.0
4 2 9 7.0
5 2 15 0.0
6 3 5 0.0
7 3 7 8.0
8 3 16 3.0
9 4 7 8.0
10 4 15 0.0
11 5 7 8.0
12 5 9 3.0
13 5 15 6.0
14 6 5 0.0
15 6 7 8.0
16 7 7 8.0
17 7 9 0.0
18 7 15 0.0
19 8 7 8.0
20 8 15 4.0
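For each Event_Number group, I want to find the Well values whose p_and_s is greater than 2.
The final DataFrame should look like this, with a new column listing the wells where p_and_s is greater than 2: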
Event_Number Well p_and_s well_array
0 1 7 4.0 [7]
1 1 9 0.0 [7]
2 1 15 0.0 [7]
3 2 7 2.0 [9]
4 2 9 7.0 [9]
5 2 15 0.0 [9]
6 3 5 0.0 [7, 16]
7 3 7 8.0 [7, 16]
8 3 16 3.0 [7, 16]
9 4 7 8.0 [7]
10 4 15 0.0 [7]
11 5 7 8.0 [7, 9, 15]
12 5 9 3.0 [7, 9, 15]
13 5 15 6.0 [7, 9, 15]
14 6 5 0.0 [7]
15 6 7 8.0 [7]
16 7 7 8.0 [7]
17 7 9 0.0 [7]
18 7 15 0.0 [7]
19 8 7 8.0 [7, 15]
20 8 15 4.0 [7, 15]
Answer 0 (score: 2)
Here is one way.
s = df[df['p_and_s'] > 2].groupby('Event_Number')['Well'].apply(list)
df['well_array'] = df['Event_Number'].map(s)
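Explanation
After applying a filter on p_and_s, create a series mapping Event_Number to the list of Well values.
Map that series onto the original dataframe via pd.Series.map.
Avoid lambda functions, since they represent expensive implicit loops.
Result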
Event_Number Well p_and_s well_array
0 1 7 4.0 [7]
1 1 9 0.0 [7]
2 1 15 0.0 [7]
3 2 7 2.0 [9]
4 2 9 7.0 [9]
5 2 15 0.0 [9]
6 3 5 0.0 [7, 16]
7 3 7 8.0 [7, 16]
8 3 16 3.0 [7, 16]
9 4 7 8.0 [7]
10 4 15 0.0 [7]
11 5 7 8.0 [7, 9, 15]
12 5 9 3.0 [7, 9, 15]
13 5 15 6.0 [7, 9, 15]
14 6 5 0.0 [7]
15 6 7 8.0 [7]
16 7 7 8.0 [7]
17 7 9 0.0 [7]
18 7 15 0.0 [7]
19 8 7 8.0 [7, 15]
20 8 15 4.0 [7, 15]
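For anyone who wants to reproduce the result above, here is a minimal self-contained sketch. It rebuilds the sample frame from the question and wraps the two lines of the answer in a helper. The function name add_well_array and its threshold parameter are made up for this illustration and are not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Event_Number": [1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 5, 5, 5, 6, 6, 7, 7, 7, 8, 8],
    "Well": [7, 9, 15, 7, 9, 15, 5, 7, 16, 7, 15, 7, 9, 15, 5, 7, 7, 9, 15, 7, 15],
    "p_and_s": [4.0, 0.0, 0.0, 2.0, 7.0, 0.0, 0.0, 8.0, 3.0, 8.0, 0.0,
                8.0, 3.0, 6.0, 0.0, 8.0, 8.0, 0.0, 0.0, 8.0, 4.0],
})


def add_well_array(frame, threshold=2):
    """Hypothetical helper: return a copy of frame with a well_array column."""
    # Filter first, then collect the surviving wells per event into a list.
    wells = (
        frame[frame["p_and_s"] > threshold]
        .groupby("Event_Number")["Well"]
        .apply(list)
    )
    # Broadcast each event's list back onto every row of that event.
    return frame.assign(well_array=frame["Event_Number"].map(wells))


result = add_well_array(df)
print(result)
```

One thing to watch for: an event with no row above the threshold is absent from the intermediate series, so map leaves NaN in well_array for its rows rather than an empty list. The sample data does not hit this case with a threshold of 2, but a larger frame might.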
Answer 1 (score: 1)
You can try this:
well_array Event_Number Well p_and_s
0 [7] 1 7 4.0
1 [7] 1 9 0.0
2 [7] 1 15 0.0
3 [9] 2 7 2.0
4 [9] 2 9 7.0
5 [9] 2 15 0.0
6 [7, 16] 3 5 0.0
7 [7, 16] 3 7 8.0
8 [7, 16] 3 16 3.0
9 [7] 4 7 8.0
10 [7] 4 15 0.0
11 [7, 9, 15] 5 7 8.0
12 [7, 9, 15] 5 9 3.0
13 [7, 9, 15] 5 15 6.0
14 [7] 6 5 0.0
15 [7] 6 7 8.0
16 [7] 7 7 8.0
17 [7] 7 9 0.0
18 [7] 7 15 0.0
19 [7, 15] 8 7 8.0
20 [7, 15] 8 15 4.0