Apply a function to specific columns in a pandas DataFrame; replace the original columns with the output columns in the same DataFrame

Asked: 2018-09-25 19:11:55

Tags: python python-3.x pandas dataframe bigdata

I have a CSV file that I read into a pandas DataFrame:

import pandas as pd


csv_file = pd.read_csv('hello.csv', engine='c', delimiter=',', index_col=0,
                       skiprows=1, header=[0, 1])

This is what the loaded file looks like (print(csv_file)):

bodyparts        nose                  ...        right_ear              
coords              x           y      ...                y    likelihood
0          197.486369    4.545954      ...       206.351233  1.280000e-06
1          319.946460  191.035224      ...       206.321893  9.680000e-07
2          319.880388  191.012984      ...       206.322207  9.520000e-07
3          320.286005  190.843329      ...       206.227396  1.020000e-06
4          320.210989  190.863304      ...         3.106570  8.350000e-07
5          320.212529  190.867178      ...         3.116692  8.460000e-07
6           -0.794705    2.462400      ...         3.112797  8.500000e-07
7           -0.785404    2.485562      ...         3.117945  8.430000e-07
8          319.786777  191.003882      ...         3.125062  8.820000e-07
9          319.947064  191.030201      ...       206.202980  9.210000e-07
10         319.845807  191.002510      ...       206.177779  8.660000e-07
11         320.135816  190.967408      ...       206.190732  8.910000e-07
12          -0.935765    2.568168      ...       206.260773  8.860000e-07
13          -0.932833    2.525062      ...       206.273504  8.780000e-07
14          -0.960939    2.500079      ...       206.272811  8.680000e-07
15          -0.832561    2.442907      ...       206.266416  8.720000e-07
16          -0.838884    2.421689      ...       206.242941  9.440000e-07
17          -0.857173    2.421467      ...       206.243972  9.950000e-07
18          -0.841627    2.414854      ...       206.225004  9.820000e-07
...               ...         ...      ...              ...           ...
10459      349.556703  301.995042      ...       307.018688  9.999745e-01
10460      348.608277  301.098244      ...       309.648986  9.999962e-01
10461      349.995217  303.397438      ...       311.149967  9.999974e-01
10462      349.109666  305.710711      ...       311.893106  9.999955e-01
10463      352.142571  310.081763      ...       317.420410  9.907742e-01
10464      351.916488  317.078128      ...       319.407211  2.706501e-01
10465      353.809847  320.086683      ...       323.478481  9.911720e-01
10466      349.233529  321.859424      ...       323.383276  8.724346e-01

The resulting DataFrame has a two-level MultiIndex on its columns:

tuple(('body_part1', 'body_part2', ..., 'body_partn'), ('x', 'y', 'likelihood'))

I want to apply a function to every row under the 'y' key and repack the DataFrame with the function's output. How can I do that?

Purpose of the function: it normalizes and/or inverts the y values using a y_max value.
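One way this could be done is sketched below, under stated assumptions. The small DataFrame is a hypothetical stand-in for the CSV, `invert_and_normalize` is an illustrative guess at the intended transform, and taking y_max as the largest y value across all body parts is also an assumption (an image height could be used instead). The idea is to build a boolean mask over the second column level and assign back through `.loc`, so the 'y' columns are replaced in the same DataFrame.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the CSV: two body parts, three coords each.
columns = pd.MultiIndex.from_product(
    [["nose", "right_ear"], ["x", "y", "likelihood"]],
    names=["bodyparts", "coords"],
)
df = pd.DataFrame(
    [[10.0, 20.0, 0.9, 30.0, 40.0, 0.8],
     [11.0, 25.0, 0.7, 31.0, 100.0, 0.6]],
    columns=columns,
)


def invert_and_normalize(y, y_max):
    """Illustrative transform (an assumption): flip the y axis, then scale by y_max."""
    return (y_max - y) / y_max


# Boolean mask that is True for every column whose second level is 'y'.
y_mask = df.columns.get_level_values(1) == "y"

# Assumption: y_max is the largest y value across all body parts.
y_values = df.loc[:, y_mask].to_numpy()
y_max = y_values.max()

# Overwrite only the 'y' columns; 'x' and 'likelihood' are left untouched.
df.loc[:, y_mask] = invert_and_normalize(y_values, y_max)
```

Passing a plain NumPy array on the right-hand side sidesteps label alignment during assignment; the column order of the mask and of the array match because both come from the same selection.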

Side question

Why does this:

csv_file.groupby('y', axis=1, level=1)

raise:

KeyError: 'y'
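A possible reading of the error, offered as an assumption rather than a confirmed diagnosis: the first positional argument of groupby is `by`, a grouping key that pandas tries to look up, not a value used to filter level 1. The sketch below, on a hypothetical one-row frame, shows two calls that do exist for this purpose: `DataFrame.xs` to pull out every 'y' column, and a level-only groupby (on the transposed frame, which avoids the `axis=1` argument) to group columns by their coordinate name.

```python
import pandas as pd

# Hypothetical one-row frame with the same two-level column layout.
columns = pd.MultiIndex.from_product(
    [["nose", "right_ear"], ["x", "y", "likelihood"]],
    names=["bodyparts", "coords"],
)
df = pd.DataFrame([[10.0, 20.0, 0.9, 30.0, 40.0, 0.8]], columns=columns)

# Cross-section: all 'y' columns, with the 'coords' level dropped.
y_only = df.xs("y", axis=1, level=1)

# Same selection, but keeping both column levels.
y_full = df.xs("y", axis=1, level=1, drop_level=False)

# Group columns by the second level alone: pass level=..., and no 'by' key.
grouped = df.T.groupby(level=1)
group_names = sorted(grouped.groups)
```

Here `y_only` has one column per body part, while `y_full` keeps the (body part, 'y') pairs, which is the shape needed when writing results back into the original frame.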

0 answers:

No answers yet