Python pandas: create a function to calculate the row-wise average across n columns

Time: 2014-05-11 01:34:50

Tags: python pandas

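I have 2 different dataframes of coin flips. I would like to create a function that finds two things for each user:

  • the average score (out of 100%), where Heads = 1 and Tails = 0
  • the number of games they played to get that score

Is it possible to make the function dynamic, so that it works for any number n of columns?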

import pandas as pd
import numpy as np

# Note: 'np.nan' below is the literal string 'np.nan', not a real NaN value.
df=pd.DataFrame({'Users': [ 'Bob', 'Jim', 'Ted', 'Jesus', 'James'],
                 'Round 1': ['np.nan','H','np.nan','T','H'],
                 'Round 2': ['np.nan','H','H','H','T'],
                 'Round 3': ['np.nan','T','T','T','T'],
                 })

df2=pd.DataFrame({'Users': [ 'Boob', 'Paul', 'Todd', 'Zeus', 'Derrik'],
                 'Round 1': ['H','H','np.nan','T','np.nan'],
                 'Round 3': ['H','T','H','T','np.nan'],
                 'Round 5': ['H','T','H','T','np.nan'],
                 'Round 7': ['H','H','H','H','H'],
                 })

df = df.set_index('Users')
df2 = df2.set_index('Users')
print(df)
print(df2)

Here is my attempt:

def score(data):
    score_map = {'H':1, 'T':0}
    data=data.replace(score_map)
    data['average']=
    data['rounds played']=

df=score(df)

I'm guessing I would have to use groupby, if that is even possible.

The result should look like this:

       Round 1  Round 2  Round 3  Average  Rounds played
Users
Bob     np.nan   np.nan   np.nan      NaN              0
Jim          1        1        0     0.66              3
Ted     np.nan        1        0      0.5              2
Jesus        0        1        0     0.33              3
James        1        0        0     0.33              3

[5 rows x 5 columns]

1 answer:

Answer 0 (score: 1)

In [104]: def score_map(x):
   .....:         if x=='H': return 1
   .....:         if x=='T': return 0
   .....:         return np.nan
   .....: 

In [105]: def score(data):
   .....:         return_df = data.applymap(score_map)
   .....:         avg = return_df.mean(axis=1)
   .....:         nrounds = return_df.count(axis=1)
   .....:         return_df['Average'] = avg
   .....:         return_df['Rounds Played']=nrounds
   .....:         return return_df
   .....: 

In [106]: score(df)
Out[106]: 
       Round 1  Round 2  Round 3   Average  Rounds Played
Users                                                    
Bob        NaN      NaN      NaN       NaN              0
Jim          1        1        0  0.666667              3
Ted        NaN        1        0  0.500000              2
Jesus        0        1        0  0.333333              3
James        1        0        0  0.333333              3

[5 rows x 5 columns]
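
Nothing in score() depends on the number of round columns, so the same idea should carry over to df2 unchanged. Below is a minimal sketch of a variant that swaps applymap for Series.map with a dict, relying on the fact that values missing from the dict (including the literal string 'np.nan') come back as NaN. The name score_vectorized is hypothetical and not part of the answer above.

```python
import pandas as pd


def score_vectorized(data):
    """Hypothetical variant of score(): works for any number of round columns."""
    # Map every column with the same dict; anything that is not 'H' or 'T'
    # (such as the literal string 'np.nan') is absent from the dict and becomes NaN.
    scored = data.apply(lambda col: col.map({'H': 1, 'T': 0}))
    result = scored.copy()
    result['Average'] = scored.mean(axis=1)         # NaN cells are skipped
    result['Rounds Played'] = scored.count(axis=1)  # counts non-NaN cells only
    return result


# df2 from the question, which has four rounds instead of three
df2 = pd.DataFrame({'Users': ['Boob', 'Paul', 'Todd', 'Zeus', 'Derrik'],
                    'Round 1': ['H', 'H', 'np.nan', 'T', 'np.nan'],
                    'Round 3': ['H', 'T', 'H', 'T', 'np.nan'],
                    'Round 5': ['H', 'T', 'H', 'T', 'np.nan'],
                    'Round 7': ['H', 'H', 'H', 'H', 'H'],
                    }).set_index('Users')

scored2 = score_vectorized(df2)
print(scored2)
```

Both the average and the round count are computed from the mapped frame before the new columns are attached, so the summary columns never feed into each other.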