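I need to normalize the data by dividing each X column by the width and each Y column by the height:
X / Width and Y / Height
Example input DataFrame: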
ID X1 Y1 X2 Y2 X3 Y3 X4 Y4 X5 Y5 Width Height
1 1 2 1 2 1 2 1 2 1 2 2 10
2 1 2 1 2 1 2 1 2 1 2 2 10
Desired output DataFrame:
ID X1n Y1n X2n Y2n X3n Y3n X4n Y4n X5n Y5n
1 .5 .2 .5 .2 .5 .2 .5 .2 .5 .2
2 .5 .2 .5 .2 .5 .2 .5 .2 .5 .2
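To try the answers on the same data, the example input can be rebuilt roughly as follows. This is only a sketch: the helper name `coords` is made up for illustration, and the dtypes (plain integers) are an assumption, since the question does not state them.

```python
import pandas as pd

# Hypothetical helper dict: X columns hold 1, Y columns hold 2, as in the question.
coords = {
    f"{axis}{i}": [1 if axis == "X" else 2] * 2
    for i in range(1, 6)
    for axis in ("X", "Y")
}

# Column order matches the question: ID, X1, Y1, ..., X5, Y5, Width, Height.
df = pd.DataFrame({"ID": [1, 2], **coords, "Width": [2, 2], "Height": [10, 10]})
```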
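Answer 0 (score: 5)
# Divide every X column by Width and every Y column by Height, row by row (axis=0).
df.update(df.filter(like='X').div(df['Width'], axis=0))
df.update(df.filter(like='Y').div(df['Height'], axis=0))
# Drop the helper columns, append 'n' to every name, then restore the ID name.
df = df.drop(columns=['Width','Height']).add_suffix('n').rename(columns={'IDn':'ID'})
df
Output: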
ID X1n Y1n X2n Y2n X3n Y3n X4n Y4n X5n Y5n
0 1 0.5 0.2 0.5 0.2 0.5 0.2 0.5 0.2 0.5 0.2
1 2 0.5 0.2 0.5 0.2 0.5 0.2 0.5 0.2 0.5 0.2
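Answer 1 (score: 2)
If any width or height measurements are missing, update will be a problem, because it does not overwrite the original values with NaN.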
wcols = df.columns[df.columns.str.contains('X')]
hcols = df.columns[df.columns.str.contains('Y')]
df.loc[:, wcols] = df.loc[:, wcols].divide(df.Width, axis=0)
df.loc[:, hcols] = df.loc[:, hcols].divide(df.Height, axis=0)
df = df.drop(columns=['Width', 'Height'])
# Append 'n' to every column name except ID
df.columns = [f'{col}n' if col != 'ID' else col for col in df.columns]
Out:
ID X1n Y1n X2n Y2n X3n Y3n X4n Y4n X5n Y5n
0 1 0.5 0.2 0.5 0.2 0.5 0.2 0.5 0.2 0.5 0.2
1 2 0.5 0.2 0.5 0.2 0.5 0.2 0.5 0.2 0.5 0.2
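The point this answer makes about update can be checked with a small sketch. The data here is hypothetical (one X column, with the second row missing its Width), and the names `scaled`, `via_update`, and `via_assign` are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical data: the second row has no Width measurement.
df = pd.DataFrame({"ID": [1, 2], "X1": [1.0, 1.0], "Width": [2.0, np.nan]})

# Row-wise division; the second row becomes NaN because its Width is NaN.
scaled = df[["X1"]].div(df["Width"], axis=0)

via_update = df.copy()
via_update.update(scaled)          # update() skips NaN, so the raw 1.0 survives

via_assign = df.copy()
via_assign["X1"] = scaled["X1"]    # plain assignment keeps the NaN
```

With update, the second row still shows the unnormalized value, which is easy to mistake for a valid result; with direct assignment the gap stays visible as NaN.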
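Answer 2 (score: 0)
When you have columns such as X1, X2, X123, and so on:
# Note: this divides by the Width/Height of the first row only,
# so it assumes every row shares the same Width and Height.
pd.concat([df.filter(regex=r'^X\d+').div(df.loc[0,'Width']),
           df.filter(regex=r'^Y\d+').div(df.loc[0,'Height'])], axis=1)\
    .rename(columns=lambda x: x+'n')\
    .assign(Height=df.Height.values[0])\
    .assign(Width=df.Width.values[0])\
    .assign(ID=df.ID)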
This gives:
X1n X2n X3n X4n X5n Y1n Y2n Y3n Y4n Y5n Height Width ID
0 0.5 0.5 0.5 0.5 0.5 0.2 0.2 0.2 0.2 0.2 10 2 1
1 0.5 0.5 0.5 0.5 0.5 0.2 0.2 0.2 0.2 0.2 10 2 2
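If Width and Height differ from row to row, the same anchored-regex selection can be combined with row-wise division. The following is a sketch under that assumption, with hypothetical data and a made-up result name `out`; it is a variation on the approach above, not the answer's own code.

```python
import pandas as pd

# Hypothetical data where each row has its own Width and Height.
df = pd.DataFrame({
    "ID": [1, 2],
    "X1": [1, 3], "Y1": [2, 5],
    "Width": [2, 6], "Height": [10, 20],
})

# Anchored patterns select only names like X1, X2, X123 (and Y likewise).
x = df.filter(regex=r"^X\d+$").div(df["Width"], axis=0)   # axis=0: divide row by row
y = df.filter(regex=r"^Y\d+$").div(df["Height"], axis=0)

# Build a new frame instead of writing back into df.
out = pd.concat([df[["ID"]], x.add_suffix("n"), y.add_suffix("n")], axis=1)
```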
Answer 3 (score: 0)
Another possible approach is to zip the column names together after applying filter() and pandas.concat():
dfX = df.filter(regex='X').div(df['Width'], axis=0)
dfY = df.filter(regex='Y').div(df['Height'], axis=0)
df = pd.concat([dfX, dfY], axis=1)
# Interleave the X and Y column names (X1, Y1, X2, Y2, ...) and add the 'n' suffix.
df_final = df[[col for elem in zip(dfX.columns, dfY.columns) for col in elem]].add_suffix('n')
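The interleaving step in the last line can be seen in isolation with plain lists. The names `x_cols`, `y_cols`, and `interleaved` are hypothetical stand-ins for `dfX.columns` and `dfY.columns`.

```python
# Hypothetical column names standing in for dfX.columns and dfY.columns.
x_cols = ["X1", "X2", "X3"]
y_cols = ["Y1", "Y2", "Y3"]

# zip pairs the names up; the nested comprehension flattens the pairs in order.
interleaved = [col for pair in zip(x_cols, y_cols) for col in pair]
print(interleaved)  # ['X1', 'Y1', 'X2', 'Y2', 'X3', 'Y3']
```

Note that zip stops at the shorter input, so this ordering trick assumes there are as many X columns as Y columns.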