我有一个'Person'
,其中填充了来自不同科目的XY坐标。我想创建一个新列,从这些主题中获取指定的XY坐标。
当import pandas as pd
import numpy as np
import random
AA = 10, 20
k = 5
N = 10
df = pd.DataFrame({
'John Doe_X' : np.random.uniform(k, k + 100 , size=N),
'John Doe_Y' : np.random.uniform(k, k + 100 , size=N),
'Kevin Lee_X' : np.random.uniform(k, k + 100 , size=N),
'Kevin Lee_Y' : np.random.uniform(k, k + 100 , size=N),
'Liam Smith_X' : np.random.uniform(k, k + -100 , size=N),
'Liam Smith_Y' : np.random.uniform(k, k + 100 , size=N),
'Event' : ['AA', 'nan', 'BB', 'nan', 'nan', 'CC', 'nan','CC', 'DD','nan'],
'Person' : ['nan','nan','John Doe','John Doe','nan','Kevin Lee','nan','Liam Smith','John Doe','John Doe']})
df['X'] = df.apply(lambda row: row.get(row['Person']+'_X') if pd.notnull(row['Person']) else np.nan, axis=1)
df['Y'] = df.apply(lambda row: row.get(row['Person']+'_Y') if pd.notnull(row['Person']) else np.nan, axis=1)
列中突出显示任何主题的名称时,即可实现此目的。这将返回该索引处该主题的XY坐标。
Event John Doe_X John Doe_Y Kevin Lee_X Kevin Lee_Y Liam Smith_X \
0 AA 75.047164 19.281168 28.064313 87.184248 -76.148559
1 nan 50.642782 68.308319 46.088057 64.132263 -83.109383
2 BB 9.965115 77.950894 48.864693 8.613132 0.106708
3 nan 44.726136 58.751520 69.904076 40.818433 -87.656064
4 nan 101.501119 99.156872 101.976300 93.539749 -57.026015
5 CC 87.778446 65.814911 7.302116 40.577156 -28.703879
6 nan 99.682139 91.715231 88.029451 82.309191 -66.444582
7 CC 38.248267 38.648960 76.065297 67.322639 -34.754868
8 DD 69.429353 61.252800 83.024358 58.038962 -62.001353
9 nan 9.522023 73.009883 41.873986 8.677565 -20.389939
Liam Smith_Y Person X Y
0 18.420494 nan NaN NaN
1 33.206289 nan NaN NaN
2 73.833204 John Doe 9.965115 77.950894
3 39.652071 John Doe 44.726136 58.751520
4 88.176561 nan NaN NaN
5 53.776995 Kevin Lee 7.302116 40.577156
6 95.025923 nan NaN NaN
7 26.851864 Liam Smith -34.754868 26.851864
8 102.771046 John Doe 69.429353 61.252800
9 28.633231 John Doe 9.522023 73.009883
输出:
'Event'
我现在希望使用['X','Y']
列来优化新的AA (10,20)
列。具体来说,当值'AA'
位于'Event'
列时,我想返回 Event John Doe_X John Doe_Y Kevin Lee_X Kevin Lee_Y Liam Smith_X \
0 AA 75.047164 19.281168 28.064313 87.184248 -76.148559
1 nan 50.642782 68.308319 46.088057 64.132263 -83.109383
2 BB 9.965115 77.950894 48.864693 8.613132 0.106708
3 nan 44.726136 58.751520 69.904076 40.818433 -87.656064
4 nan 101.501119 99.156872 101.976300 93.539749 -57.026015
5 CC 87.778446 65.814911 7.302116 40.577156 -28.703879
6 nan 99.682139 91.715231 88.029451 82.309191 -66.444582
7 CC 38.248267 38.648960 76.065297 67.322639 -34.754868
8 DD 69.429353 61.252800 83.024358 58.038962 -62.001353
9 nan 9.522023 73.009883 41.873986 8.677565 -20.389939
Liam Smith_Y Person X Y
0 18.420494 nan 10 20
1 33.206289 nan 10 20
2 73.833204 John Doe 9.965115 77.950894
3 39.652071 John Doe 44.726136 58.751520
4 88.176561 nan NaN NaN
5 53.776995 Kevin Lee 7.302116 40.577156
6 95.025923 nan NaN NaN
7 26.851864 Liam Smith -34.754868 26.851864
8 102.771046 John Doe 69.429353 61.252800
9 28.633231 John Doe 9.522023 73.009883
的坐标。此外,我喜欢获得相同的坐标,直到下一个坐标出现。
所以输出看起来像是:
for value in df['Event']:
if value == 'AA' :
df['X', 'Y'] = AA
我试着写这样的东西:
ValueError: Length of values does not match length of index
但是得到一个ValueError:merge
答案 0 :(得分:0)
您的代码有一些错误(Person与其他东西错误)。我认为这是一个粘贴错误。
然而,使用蒙版并将元组AA应用于蒙版使用的子集df.loc
m = df['Event'] == 'AA'
df.loc[m, ['X','Y']] = AA
答案 1 :(得分:0)
如果要遍历行,可以尝试:
# iterate through rows
for index, row in df.iterrows():
# check Event value for the row
if row['Event'] == 'AA' :
# update dataframe
df.loc[index,('X', 'Y')] = AA
print(df)
结果:
Event John Doe_X John Doe_Y Kevin Lee_X Kevin Lee_Y Liam Smith_X \
0 AA 12.603084 81.636376 25.997186 76.733337 -17.683132
1 nan 104.652839 104.064767 56.762357 83.599629 -34.714117
2 BB 69.724434 33.324135 98.452840 57.407782 -8.479175
3 nan 16.361719 51.290716 41.929234 46.494053 -81.882100
4 nan 30.874579 34.683986 95.434111 80.343098 -62.448286
5 CC 77.619875 70.164773 7.385376 40.142712 -55.590472
6 nan 31.214066 54.081010 36.249414 34.218611 -21.754019
7 CC 91.487647 28.307019 71.235864 48.915612 -37.196812
8 DD 45.036216 61.655465 50.231592 29.511502 -4.583804
9 nan 95.249002 25.649100 31.959114 10.234085 -93.106746
X NaN NaN NaN NaN NaN NaN
Liam Smith_Y Person X Y
0 86.267909 nan 10.000000 20.000000
1 43.090388 nan NaN NaN
2 56.330139 John Doe 69.724434 33.324135
3 65.648633 John Doe 16.361719 51.290716
4 16.349304 nan NaN NaN
5 5.528887 Kevin Lee 7.385376 40.142712
6 75.717007 nan NaN NaN
7 100.925457 Liam Smith -37.196812 100.925457
8 87.256541 John Doe 45.036216 61.655465
9 35.361163 John Doe 95.249002 25.649100
X NaN NaN NaN NaN