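I have two different DataFrames of coin flips. I want to create a function that finds two things for each user: the average score and the number of rounds played.
Is it possible to make the function dynamic, so that it works for n columns?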
import pandas as pd
import numpy as np
# np.nan (not the string 'np.nan') marks a round the user did not play
df = pd.DataFrame({'Users': ['Bob', 'Jim', 'Ted', 'Jesus', 'James'],
                   'Round 1': [np.nan, 'H', np.nan, 'T', 'H'],
                   'Round 2': [np.nan, 'H', 'H', 'H', 'T'],
                   'Round 3': [np.nan, 'T', 'T', 'T', 'T'],
                   })
df2 = pd.DataFrame({'Users': ['Boob', 'Paul', 'Todd', 'Zeus', 'Derrik'],
                    'Round 1': ['H', 'H', np.nan, 'T', np.nan],
                    'Round 3': ['H', 'T', 'H', 'T', np.nan],
                    'Round 5': ['H', 'T', 'H', 'T', np.nan],
                    'Round 7': ['H', 'H', 'H', 'H', 'H'],
                    })
df = df.set_index('Users')
df2 = df2.set_index('Users')
print(df)
print(df2)
Here is my attempt:
def score(data):
    score_map = {'H': 1, 'T': 0}
    data = data.replace(score_map)
    data['average'] = ...        # not sure how to compute this
    data['rounds played'] = ...  # not sure how to compute this
    return data

df = score(df)
I guess I have to use groupby, if that is possible.
The result should look like this:
       Round 1  Round 2  Round 3  Average  Rounds played
Users
Bob        NaN      NaN      NaN      NaN              0
Jim          1        1        0     0.66              3
Ted        NaN        1        0     0.50              2
Jesus        0        1        0     0.33              3
James        1        0        0     0.33              3
[5 rows x 5 columns]
Answer 0 (score: 1)
In [104]: def score_map(x):
   .....:     if x == 'H': return 1
   .....:     if x == 'T': return 0
   .....:     return np.nan
   .....:
In [105]: def score(data):
   .....:     return_df = data.applymap(score_map)
   .....:     avg = return_df.mean(axis=1)
   .....:     nrounds = return_df.count(axis=1)
   .....:     return_df['Average'] = avg
   .....:     return_df['Rounds Played'] = nrounds
   .....:     return return_df
   .....:
In [106]: score(df)
Out[106]:
       Round 1  Round 2  Round 3   Average  Rounds Played
Users
Bob        NaN      NaN      NaN       NaN              0
Jim          1        1        0  0.666667              3
Ted        NaN        1        0  0.500000              2
Jesus        0        1        0  0.333333              3
James        1        0        0  0.333333              3
[5 rows x 5 columns]
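Nothing in this approach depends on the number of round columns, so it should also work on the four-round df2 from the question. Below is a minimal sketch of the same idea as a standalone script. It swaps applymap for a per-column Series.map with a dict (newer pandas releases deprecate applymap in favor of DataFrame.map), and it assumes missing rounds are stored as real np.nan values. The name `result` is only for illustration.

```python
import numpy as np
import pandas as pd


def score(data):
    """Map 'H'/'T' to 1/0 and append per-user summary columns."""
    # Series.map with a dict turns every value that is not a key into NaN,
    # so missing rounds (or any other stray value) do not count as played.
    scored = data.apply(lambda col: col.map({'H': 1, 'T': 0}))
    # Compute both summaries first, so they only see the round columns.
    average = scored.mean(axis=1)        # NaN values are skipped
    rounds_played = scored.count(axis=1)  # number of non-NaN values per row
    scored['Average'] = average
    scored['Rounds Played'] = rounds_played
    return scored


# df2 from the question, with np.nan for rounds that were not played.
df2 = pd.DataFrame({'Users': ['Boob', 'Paul', 'Todd', 'Zeus', 'Derrik'],
                    'Round 1': ['H', 'H', np.nan, 'T', np.nan],
                    'Round 3': ['H', 'T', 'H', 'T', np.nan],
                    'Round 5': ['H', 'T', 'H', 'T', np.nan],
                    'Round 7': ['H', 'H', 'H', 'H', 'H'],
                    }).set_index('Users')

result = score(df2)
print(result)
```

No groupby is needed here, because each user is already a single row; the row-wise mean(axis=1) and count(axis=1) do the aggregation.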