Hi, I'm trying to convert this expression to pandas because I'm not familiar with R:
sum(data_file$finished_race_date >= 0, na.rm = TRUE)/sum(data_file$signup_race_date >= 0, na.rm = TRUE)
I want to work out what share of the runners who signed up for a race went on to finish it.
Answer 0 (score: 1)
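If you need the ratio of True counts from two boolean masks built with notnull, divide one sum by the other. The factor of 100 turns the ratio into a percentage; drop it to reproduce the R result exactly: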
100 * data_file.finished_race_date.notnull().sum()/data_file.signup_race_date.notnull().sum()
Sample:
import pandas as pd
import numpy as np
data_file = pd.DataFrame({'finished_race_date':['2/5/16',np.nan,np.nan],
'signup_race_date':[np.nan,'2/5/16','2/5/16']})
print (data_file)
  finished_race_date signup_race_date
0             2/5/16              NaN
1                NaN           2/5/16
2                NaN           2/5/16
print (data_file.finished_race_date.notnull())
0     True
1    False
2    False
Name: finished_race_date, dtype: bool
print (data_file.finished_race_date.notnull().sum())
1
finished_race_date = data_file.finished_race_date.notnull().sum()
signup_race_date = data_file.signup_race_date.notnull().sum()
print (100 * finished_race_date / signup_race_date)
50.0
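As a possible variation on the sample above, the same calculation can be sketched with Series.count(), which counts non-missing values directly and so gives the same numbers as notnull().sum(). The helper name completion_ratio is hypothetical and not part of the original answer, and returning NaN when nobody signed up is an assumption about how an empty denominator should be handled.

```python
import numpy as np
import pandas as pd

# Same sample data as in the answer above.
data_file = pd.DataFrame({
    'finished_race_date': ['2/5/16', np.nan, np.nan],
    'signup_race_date': [np.nan, '2/5/16', '2/5/16'],
})


def completion_ratio(df):
    """Hypothetical helper: finished count divided by signed-up count."""
    # count() skips NaN, so it matches notnull().sum()
    finished = df['finished_race_date'].count()
    signed_up = df['signup_race_date'].count()
    if signed_up == 0:
        # Assumption: report NaN rather than dividing by zero
        return float('nan')
    return finished / signed_up


ratio = completion_ratio(data_file)
print(ratio)        # 0.5, a fraction like the R expression returns
print(100 * ratio)  # 50.0, the percentage form used in the answer
```

Returning the plain fraction keeps the result comparable with the R one-liner; multiply by 100 only when a percentage is wanted.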