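Hi everyone, I have a DataFrame; its columns and a few sample rows are listed at the end of this question.
In this DataFrame, each row is a single observation with all of these column attributes. My task is to compute a variable P, then regress P on x, regress P on y, and finally regress P on x and y together, where P = (number of moves with value y that were lost) / (total number of moves with value y).
My problem is finding P for each of my groups. I'm not sure how to approach this in a Pythonic way. I could loop manually and tally all the counts, but even then I'm not sure how to go about it, and given the size of the DataFrame that could take a very long time.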
WhiteR,BlackR,EMV,MovePlayedValue,NewGame,NinePtLead,AverageRating,Rating_Group,length_of_checkmate
1880.0,1865.0,27.0,27.0,1,useless,1875,1800,0
1880.0,1865.0,22.0,21.0,1,useless,1875,1800,0
1865.0,1880.0,25.0,25.0,1,useless,1875,1800,0
1880.0,1865.0,24.0,19.0,1,useless,1875,1800,0
1865.0,1880.0,22.0,22.0,1,useless,1875,1800,0
1880.0,1865.0,27.0,27.0,1,bigLeadLost,1875,1800,2
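Answer 0 (score: 0)
If I understand your question correctly, you want the frequency of y-type moves that lead to a loss (the non-zero type), divided by the total number of y-type moves: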
import pandas as pd
import numpy as np
# Sample rows from the question, one list per column.
data = {'WhiteR': [1880.0, 1880.0, 1865.0, 1880.0, 1865.0, 1880.0],
        'BlackR': [1865.0, 1865.0, 1880.0, 1865.0, 1880.0, 1865.0],
        'EMV': [27.0, 22.0, 25.0, 24.0, 22.0, 27.0],
        'MovePlayedValue': [27.0, 21.0, 25.0, 19.0, 22.0, 27.0],
        'NewGame': [1, 1, 1, 1, 1, 1],
        'NinePtLead': ['useless', 'useless', 'useless', 'useless', 'useless', 'bigLeadLost'],
        'AverageRating': [1875, 1875, 1875, 1875, 1875, 1875],
        'Rating_Group': [1800, 1800, 1800, 1800, 1800, 1800],
        'length_of_checkmate': [0, 0, 0, 0, 0, 2]}
df = pd.DataFrame(data)
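# Count how often each length_of_checkmate value occurs.
# rename_axis + reset_index(name=...) yields the same two columns on old and new pandas versions.
status = (df['length_of_checkmate'].value_counts()
          .rename_axis('length_of_checkmate')
          .reset_index(name='Freq.'))
df1 = pd.merge(df, status, on='length_of_checkmate')
# Rows with length_of_checkmate == 0 divide by zero and give inf, which is replaced by 0.
df1['P'] = (df1['Freq.'] / df1['length_of_checkmate']).replace(np.inf, 0)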
# Then proceed to regress P on x, regress P on y, and finally regress P on x and y together.
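If P is meant literally as "lost moves divided by total moves within each group", a groupby may be a more direct route than merging frequency counts. The following is only a sketch under two assumptions that the question does not confirm: that a move counts as "lost" when NinePtLead equals 'bigLeadLost', and that the groups are the values of Rating_Group. The helper column `lost` and the result name `p_by_group` are made up for illustration.

```python
import pandas as pd

# Sample rows from the question (only the columns used in this sketch).
df = pd.DataFrame({
    "NinePtLead": ["useless", "useless", "useless", "useless", "useless", "bigLeadLost"],
    "Rating_Group": [1800, 1800, 1800, 1800, 1800, 1800],
})

# ASSUMPTION: a move counts as "lost" when NinePtLead == "bigLeadLost".
df["lost"] = df["NinePtLead"].eq("bigLeadLost")

# ASSUMPTION: the groups are the values of Rating_Group.
# One row per group: number of lost moves, total moves, and their ratio P.
p_by_group = (
    df.groupby("Rating_Group")["lost"]
      .agg(lost_moves="sum", total_moves="size")
      .assign(P=lambda g: g["lost_moves"] / g["total_moves"])
      .reset_index()
)

# Attach each group's P back onto the individual rows
# (the mean of a boolean column is the fraction of True values).
df["P"] = df.groupby("Rating_Group")["lost"].transform("mean")

print(p_by_group)
```

On the six sample rows there is a single group (1800) with one lost move out of six, so P is 1/6. This is vectorized, so it should avoid the manual loop the question worries about; the grouping key and the "lost" condition would need to be swapped for whatever x and y actually are.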