Adding a calculated row in pandas

Time: 2020-06-11 16:52:04

Tags: python pandas data-analysis

gender    math score    reading score    writing score
female    65            73               74
male      69            66               64

Given the DataFrame above, how can we add a row that holds the difference between the two rows' values, like this:

gender        math score    reading score    writing score
female        65            73               74
male          69            66               64
Difference    -4            7                10

Or is there a more convenient way to express the difference between rows?

Thanks in advance.

3 answers:

Answer 0: (score: 1)

As a one-liner using .loc[] and .diff():

df.loc['Difference'] = df.diff(-1).dropna().values.tolist()[0]
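
To make the one-liner easier to try, here is a minimal self-contained sketch. It assumes the question's data with gender already set as the index, which the answer does not state explicitly; the construction of df below is only an illustration of that assumption.

```python
import pandas as pd

# Assumed setup: the question's scores, with gender as the index.
df = pd.DataFrame(
    {"math score": [65, 69], "reading score": [73, 66], "writing score": [74, 64]},
    index=pd.Index(["female", "male"], name="gender"),
)

# diff(-1) subtracts the following row from each row, so the first row
# holds female - male and the last row is all NaN; dropna() removes it.
df.loc["Difference"] = df.diff(-1).dropna().values.tolist()[0]
print(df)
```

Note that this relies on the frame having exactly two rows: with more rows, dropna() would leave several difference rows and [0] would pick only the first.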

Another idea is to work on the transposed DataFrame and then transpose it back:

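import pandas as pd

# Rows in the same order as the question: female first, then male
df = pd.DataFrame({'gender':['female','male'],'math score':[65,69],'reading score':[73,66],'writing score':[74,64]}).set_index('gender')
df = df.T  # subjects become rows, genders become columns
# diff(periods=-1, axis=1) subtracts the next column, so 'female' holds female - male
df['Difference'] = df.diff(periods=-1, axis=1)['female'].values
df = df.T  # back to the original orientation (values are upcast to float)

Output:

            math score  reading score  writing score
gender                                              
female            65.0           73.0           74.0
male              69.0           66.0           64.0
Difference        -4.0            7.0           10.0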

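Answer 1: (score: 0)

You can compute the difference by selecting each row and subtracting one from the other. However, as you already guessed, that is not the best approach. A more convenient way is to transpose df and then subtract: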

import pandas as pd

df = pd.DataFrame([[65, 73, 74], [69, 66, 64]], 
                  index=['female', 'male'], 
                  columns=['math score', 'reading score', 'writing score'])

df_ = df.T

df_['Difference'] = df_['female'] - df_['male']

This is what you get:

               female  male  Difference
math score         65    69          -4
reading score      73    66           7
writing score      74    64          10

If needed, you can transpose again with df_.T to get back to the original layout.
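
For comparison, the row-by-row subtraction mentioned at the start of this answer could look roughly like the sketch below. It reuses the same df and assumes the row labels are exactly 'female' and 'male'; it is one possible way to write it, not necessarily what the answerer had in mind.

```python
import pandas as pd

df = pd.DataFrame(
    [[65, 73, 74], [69, 66, 64]],
    index=["female", "male"],
    columns=["math score", "reading score", "writing score"],
)

# Select both rows by label and subtract; the two Series align on column names.
df.loc["Difference"] = df.loc["female"] - df.loc["male"]
print(df)
```

This avoids the two transposes, at the cost of hard-coding the two row labels.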

Answer 2: (score: 0)

Given:

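import pandas as pd
df = pd.DataFrame({"A":[5, 10], "B":[9, 8], "gender": ["female", "male"]}).set_index("gender")
# apply() passes each column as a Series indexed by gender, so x["female"] - x["male"] is computed per column
df.loc['Difference'] = df.apply(lambda x: x["female"]-x["male"])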
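
The same pattern can be applied to the scores from the question. The sketch below is an adaptation, not part of the original answer; the intermediate name diff is introduced here only to make the result easy to inspect.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "math score": [65, 69],
        "reading score": [73, 66],
        "writing score": [74, 64],
        "gender": ["female", "male"],
    }
).set_index("gender")

# With the default axis=0, apply() calls the lambda once per column and
# passes that column as a Series indexed by gender.
diff = df.apply(lambda col: col["female"] - col["male"])

# diff is a Series indexed by column name, so it aligns when added as a row.
df.loc["Difference"] = diff
print(df)
```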