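Using pandas groupby to find the percentage of profitable investments

Time: 2014-11-21 10:01:16

Tags: python pandas dataframe

I have a pandas DataFrame like the one below; it shows the history of stock investments. In the Profit column, 1 means profitable and 0 means a loss.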

Stock  Year  Profit  Count
AAPL   2012  0       23
AAPL   2012  1       19
AAPL   2013  0       20
AAPL   2013  1       10
GOOG   2012  0       26
GOOG   2012  1       20
GOOG   2013  0       23
GOOG   2013  1       11

I need to find the percentage of profitable investments for each stock and year:

Stock  Year  Profit  CountPercent
AAPL   2012  1       45.24
AAPL   2013  1       33.33
GOOG   2012  1       43.48
GOOG   2013  1       32.35

I tried the approach from this post, but it raises 'TypeError: Join on level between two MultiIndex objects is ambiguous'.

3 Answers:

Answer 0 (score: 2)

I have loaded your data into a DataFrame named "stocks".

import pandas

# Count of profitable trades, indexed by stock and year:
count_profitable = stocks[stocks['Profit'] == 1].set_index(['Stock', 'Year'])['Count']
# Count of all trades, indexed by stock and year:
count_all = stocks.groupby(['Stock', 'Year'])['Count'].sum()
# Render floats as percentages (note: this is a global display setting)
pandas.options.display.float_format = '{:.2f}%'.format
(count_profitable / count_all) * 100

This produces:

Stock  Year
AAPL   2012   45.24%
       2013   33.33%
GOOG   2012   43.48%
       2013   32.35%
Name: Count, dtype: float64
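
Setting pandas.options.display.float_format changes how every float is printed for the rest of the session. As a minimal sketch of an alternative, assuming you only want this one result formatted, the snippet below rebuilds the question's data and formats the Series locally. The variable names percent and formatted are illustrative, not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question.
stocks = pd.DataFrame({
    "Stock":  ["AAPL"] * 4 + ["GOOG"] * 4,
    "Year":   [2012, 2012, 2013, 2013] * 2,
    "Profit": [0, 1] * 4,
    "Count":  [23, 19, 20, 10, 26, 20, 23, 11],
})

# Same computation as in the answer: profitable count / total count per (Stock, Year).
count_profitable = stocks[stocks["Profit"] == 1].set_index(["Stock", "Year"])["Count"]
count_all = stocks.groupby(["Stock", "Year"])["Count"].sum()
percent = count_profitable / count_all * 100

# Format only this Series as strings; global display options stay untouched.
formatted = percent.map("{:.2f}%".format)
print(formatted)
```

The numeric Series percent is still available for further calculations; formatted is for display only.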

Answer 1 (score: 2)

You can use pivot_table:

In [38]: result = df.pivot_table(index=['Stock', 'Year'], columns='Profit', values='Count', aggfunc='sum')

In [39]: result['CountPercent'] = result[1]/(result[0]+result[1])

In [41]: result['CountPercent']
Out[41]: 
Stock  Year
AAPL   2012    0.452381
       2013    0.333333
GOOG   2012    0.434783
       2013    0.323529
Name: CountPercent, dtype: float64
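
One caveat with the pivot_table approach: if a (Stock, Year) pair has no losing row (or no profitable row), the corresponding cell is NaN and the ratio becomes NaN as well. A minimal sketch, assuming that a missing row should count as zero trades, passes fill_value=0 to pivot_table. The MSFT row below is a hypothetical addition used only to show the effect; it is not part of the question's data.

```python
import pandas as pd

df = pd.DataFrame({
    "Stock":  ["AAPL"] * 4 + ["GOOG"] * 4 + ["MSFT"],
    "Year":   [2012, 2012, 2013, 2013] * 2 + [2012],
    "Profit": [0, 1] * 4 + [1],
    # The last value (5) belongs to the hypothetical MSFT row, which has no Profit == 0 partner.
    "Count":  [23, 19, 20, 10, 26, 20, 23, 11, 5],
})

# fill_value=0 replaces the missing (MSFT, 2012, Profit == 0) cell with 0 instead of NaN.
result = df.pivot_table(index=["Stock", "Year"], columns="Profit",
                        values="Count", aggfunc="sum", fill_value=0)

# Share of profitable trades per row, expressed as a percentage.
percent = result[1] / result.sum(axis=1) * 100
print(percent.round(2))
```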

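Answer 2 (score: 1)

Assuming your DataFrame is consistently ordered (i.e. within each group, the row with 0 in the 'Profit' column comes before the row with 1), you can do the following groupby operation: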

>>> grouped = df.groupby(['Stock', 'Year'])
>>> perc = grouped['Count'].last() / grouped['Count'].sum()
>>> perc.reset_index()
  Stock  Year     Count
0  AAPL  2012  0.452381
1  AAPL  2013  0.333333
2  GOOG  2012  0.434783
3  GOOG  2013  0.323529

This is just an ordinary DataFrame, so it should be straightforward to rename the 'Count' column, round it to two decimal places, and add the 'Profit' column back.
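
Those finishing steps could look roughly like the sketch below. It rebuilds the question's data and sorts by 'Profit' first, so that the ordering assumption behind last() holds regardless of the input order; the sort and the name out are additions for illustration, not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Stock":  ["AAPL"] * 4 + ["GOOG"] * 4,
    "Year":   [2012, 2012, 2013, 2013] * 2,
    "Profit": [0, 1] * 4,
    "Count":  [23, 19, 20, 10, 26, 20, 23, 11],
})

# Sort so the Profit == 1 row is the last row of every (Stock, Year) group.
ordered = df.sort_values(["Stock", "Year", "Profit"])
grouped = ordered.groupby(["Stock", "Year"])
perc = grouped["Count"].last() / grouped["Count"].sum()

# Rename the column, convert to a percentage rounded to two decimals,
# and add the 'Profit' column back in its original position.
out = perc.reset_index().rename(columns={"Count": "CountPercent"})
out["CountPercent"] = (out["CountPercent"] * 100).round(2)
out.insert(2, "Profit", 1)
print(out)
```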