I have a problem that is making my head spin. Suppose I have the following DataFrame:
import numpy as np
import pandas as pd

df2 = pd.DataFrame(np.random.randint(0,3,size=(10, 4)),columns=['ONE', 'TWO', 'CARS', 'FOUR'])
df2['NAMES'] = ['Peter','Jon','Mary','Mary','Peter','Peter','BONIFACE','Michael','Lucy','Gilari']
df2['CARS'] = ['Mercedes','BMW','Ford','BMW','BMW','Dacia','Ford','Pontiac','Chevrolet','Tesla']
For example, I group it by car:
agrupe = df2.groupby(['CARS'])
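The problem is that once it is grouped, I want to operate on it. For example, within the BMW group, for the rows that have 2 in column ONE, I want to assign the value of column TWO to column FOUR. Let's see if I can learn how to manipulate it: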
g = agrupe.get_group('BMW')
From this:
   ONE  TWO CARS  FOUR  NAMES
1    1    0  BMW     1    Jon
3    2    1  BMW     1   Mary
4    0    1  BMW     0  Peter
To this:
   ONE  TWO CARS  FOUR  NAMES
1    1    0  BMW     1    Jon
3    2    1  BMW     1   Mary
4    0    1  BMW     1  Peter
Answer 0 (score: 1)
It seems you need groupby with a custom function f:
import numpy as np
import pandas as pd

np.random.seed(100)
df2 = pd.DataFrame(np.random.randint(0,3,size=(10, 4)),columns=['ONE', 'TWO', 'CARS', 'FOUR'])
df2['NAMES'] = ['Peter','Jon','Mary','Mary','Peter','Peter','BONIFACE','Michael','Lucy','Gilari']
df2['CARS'] = ['Mercedes','BMW','Ford','BMW','BMW','Dacia','Ford','Pontiac','Chevrolet','Tesla']
print (df2)
   ONE  TWO       CARS  FOUR     NAMES
0    0    0   Mercedes     2     Peter
1    2    0        BMW     1       Jon
2    2    2       Ford     2      Mary
3    1    0        BMW     0      Mary
4    0    2        BMW     1     Peter
5    1    2      Dacia     0     Peter
6    0    1       Ford     1  BONIFACE
7    0    0    Pontiac     1   Michael
8    1    2  Chevrolet     2      Lucy
9    1    1      Tesla     2    Gilari
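def f(x):
    # x.name holds the key of the group currently being processed
    if x.name == 'BMW':
        # within the BMW group, copy TWO into FOUR where ONE is 2
        x.loc[x.ONE == 2, 'FOUR'] = x.TWO
    return x

# group_keys=False keeps the original row index on recent pandas versions
# (older versions did this by default for a function that returns the group unchanged in shape)
agrupe = df2.groupby('CARS', group_keys=False).apply(f)
print (agrupe)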
   ONE  TWO       CARS  FOUR     NAMES
0    0    0   Mercedes     2     Peter
1    2    0        BMW     0       Jon
2    2    2       Ford     2      Mary
3    1    0        BMW     0      Mary
4    0    2        BMW     1     Peter
5    1    2      Dacia     0     Peter
6    0    1       Ford     1  BONIFACE
7    0    0    Pontiac     1   Michael
8    1    2  Chevrolet     2      Lucy
9    1    1      Tesla     2    Gilari
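The function f relies on x.name being the key of the group it receives. As a minimal sketch of what groupby hands over per group, the snippet below iterates over the groups of a small made-up frame (the data and the names df and groups are illustrative assumptions, not part of the original post):

```python
import pandas as pd

# Small illustrative frame (hypothetical data, not the seeded frame above).
df = pd.DataFrame({
    'CARS': ['Mercedes', 'BMW', 'Ford', 'BMW'],
    'ONE': [0, 2, 2, 1],
})

# Iterating over a GroupBy yields (key, sub-frame) pairs, sorted by key by default.
# The key is what a function passed to apply sees as x.name.
groups = {name: list(group.index) for name, group in df.groupby('CARS')}
print(groups)
```

Each sub-frame keeps the original row labels, which is why the BMW rows can be matched back to the full frame.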
A better solution is to first select all rows where column CARS is BMW and column ONE is 2, and then set FOUR from column TWO:
df2.loc[(df2.CARS == 'BMW') & (df2.ONE == 2), 'FOUR'] = df2.TWO
print (df2)
   ONE  TWO       CARS  FOUR     NAMES
0    0    0   Mercedes     2     Peter
1    2    0        BMW     0       Jon
2    2    2       Ford     2      Mary
3    1    0        BMW     0      Mary
4    0    2        BMW     1     Peter
5    1    2      Dacia     0     Peter
6    0    1       Ford     1  BONIFACE
7    0    0    Pontiac     1   Michael
8    1    2  Chevrolet     2      Lucy
9    1    1      Tesla     2    Gilari
That is, for the rows with 2 in column ONE, column FOUR is changed to the value of column TWO.
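The same conditional assignment can also be written without loc. The sketch below swaps in Series.mask, which replaces values where a condition is True, and runs it on a small made-up frame (the data and the names df and mask are illustrative assumptions, not from the original post):

```python
import pandas as pd

# Small illustrative frame (hypothetical data, not the seeded frame above).
df = pd.DataFrame({
    'ONE':  [0, 2, 2, 1, 2],
    'TWO':  [0, 0, 2, 0, 1],
    'CARS': ['Mercedes', 'BMW', 'Ford', 'BMW', 'BMW'],
    'FOUR': [2, 1, 2, 0, 0],
})

# Rows that are BMW and have 2 in column ONE.
mask = (df['CARS'] == 'BMW') & (df['ONE'] == 2)

# Where mask is True take the value from TWO; all other rows keep FOUR as it is.
df['FOUR'] = df['FOUR'].mask(mask, df['TWO'])

print(df)
```

Only rows 1 and 4 satisfy both conditions here, so the Ford row with 2 in column ONE is left alone.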