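I have a pandas DataFrame like the one below; it shows a history of stock investments. In the Profit column, 1 means the investment was profitable and 0 means it made a loss.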
Stock  Year  Profit  Count
AAPL   2012       0     23
AAPL   2012       1     19
AAPL   2013       0     20
AAPL   2013       1     10
GOOG   2012       0     26
GOOG   2012       1     20
GOOG   2013       0     23
GOOG   2013       1     11
I need to find the percentage of profitable investments for each stock and year:
Stock  Year  Profit  CountPercent
AAPL   2012       1         45.24
AAPL   2013       1         33.33
GOOG   2012       1         43.48
GOOG   2013       1         32.35
I tried the method from this post, but it raises 'TypeError: Join on level between two MultiIndex objects is ambiguous'.
Answer 0 (score: 2)
I've loaded your data into a DataFrame named "stocks".
import pandas

# Count of profitable trades, indexed by stock + year:
count_profitable = stocks[stocks['Profit'] == 1].set_index(['Stock', 'Year'])['Count']
# Count of all trades, indexed by stock + year:
count_all = stocks.groupby(['Stock', 'Year'])['Count'].sum()
# Render floats as percentages with two decimals (this is a global display option)
pandas.options.display.float_format = '{:.2f}%'.format
(count_profitable / count_all) * 100
This produces:
Stock  Year
AAPL   2012   45.24%
       2013   33.33%
GOOG   2012   43.48%
       2013   32.35%
Name: Count, dtype: float64
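For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the question's table by hand and rounds the values instead of changing the global float display option. The variable name `percent` is only illustrative; the division works because both Series share the same (Stock, Year) index, so pandas aligns them label by label.

```python
import pandas as pd

# Data copied from the question.
stocks = pd.DataFrame({
    'Stock': ['AAPL'] * 4 + ['GOOG'] * 4,
    'Year': [2012, 2012, 2013, 2013] * 2,
    'Profit': [0, 1] * 4,
    'Count': [23, 19, 20, 10, 26, 20, 23, 11],
})

# Profitable counts and total counts, both indexed by (Stock, Year).
count_profitable = stocks[stocks['Profit'] == 1].set_index(['Stock', 'Year'])['Count']
count_all = stocks.groupby(['Stock', 'Year'])['Count'].sum()

# Division aligns on the shared MultiIndex; round the values rather than
# touching pandas.options.display.float_format.
percent = (count_profitable / count_all * 100).round(2)
print(percent)
```

Rounding the data keeps the formatting local to this one result, whereas the display option affects how every float is printed for the rest of the session.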
Answer 1 (score: 2)
You can use pivot_table:
In [38]: result = df.pivot_table(index=['Stock', 'Year'], columns='Profit', values='Count', aggfunc='sum')
In [39]: result['CountPercent'] = result[1]/(result[0]+result[1])
In [41]: result['CountPercent']
Out[41]:
Stock  Year
AAPL   2012    0.452381
       2013    0.333333
GOOG   2012    0.434783
       2013    0.323529
Name: CountPercent, dtype: float64
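One case the pivot approach may trip over is a stock/year that has only losing (or only winning) rows, because the missing cell becomes NaN. The sketch below is a self-contained variant that passes `fill_value=0` and scales the result to a rounded percentage. The MSFT row is a made-up addition for illustration only and is not part of the question's data.

```python
import pandas as pd

df = pd.DataFrame({
    'Stock': ['AAPL'] * 4 + ['GOOG'] * 4 + ['MSFT'],
    'Year': [2012, 2012, 2013, 2013] * 2 + [2012],
    'Profit': [0, 1] * 4 + [0],
    # Last value is a hypothetical group with losses only.
    'Count': [23, 19, 20, 10, 26, 20, 23, 11, 15],
})

# One row per (Stock, Year), one column per Profit value (0 and 1).
# fill_value=0 replaces the missing Profit == 1 cell for MSFT.
result = df.pivot_table(index=['Stock', 'Year'], columns='Profit',
                        values='Count', aggfunc='sum', fill_value=0)

# Share of profitable investments, as a percentage with two decimals.
result['CountPercent'] = (result[1] / (result[0] + result[1]) * 100).round(2)
print(result)
```

Without `fill_value=0`, the hypothetical MSFT row would come out as NaN rather than 0.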
Answer 2 (score: 1)
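Assuming your DataFrame is consistently ordered (that is, within each stock and year the row with 0 in the 'Profit' column comes before the row with 1), you can do the following groupby operation: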
>>> grouped = df.groupby(['Stock', 'Year'])
>>> perc = grouped['Count'].last() / grouped['Count'].sum()
>>> perc.reset_index()
  Stock  Year     Count
0  AAPL  2012  0.452381
1  AAPL  2013  0.333333
2  GOOG  2012  0.434783
3  GOOG  2013  0.323529
This is just an ordinary DataFrame, so it should be straightforward to rename the 'Count' column, round it to two decimal places, and add the 'Profit' column back.
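Those finishing steps could look roughly like the sketch below. It rebuilds the question's data and sorts by 'Profit' first, so the ordering assumption holds even if the rows arrive shuffled. The name `out` is only illustrative, and the sketch still assumes every stock/year has a Profit == 1 row, since `last()` would otherwise pick up a losing row.

```python
import pandas as pd

df = pd.DataFrame({
    'Stock': ['AAPL'] * 4 + ['GOOG'] * 4,
    'Year': [2012, 2012, 2013, 2013] * 2,
    'Profit': [0, 1] * 4,
    'Count': [23, 19, 20, 10, 26, 20, 23, 11],
})

# Sort so that the Profit == 1 row is last within each (Stock, Year) group.
grouped = df.sort_values(['Stock', 'Year', 'Profit']).groupby(['Stock', 'Year'])
perc = grouped['Count'].last() / grouped['Count'].sum()

# Scale to percent, round, rename the Series, and flatten the index.
out = perc.mul(100).round(2).rename('CountPercent').reset_index()

# Re-add the constant Profit column in the position the question shows.
out.insert(2, 'Profit', 1)
print(out)
```

The resulting columns are Stock, Year, Profit, and CountPercent, matching the layout requested in the question.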