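I'm playing with stock data, and I'm trying to filter for the groups whose Transaction values contain more Buy entries than Sale entries.
The code that produces the data shown below is: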
df.groupby('Stock').Transaction.value_counts()
Data:
Stock  Transaction
ADC    Buy                 2
AKAM   Option Exercise    51
       Sale               34
       Buy                 9
AMNB   Buy                10
ARCC   Buy                15
ARL    Buy                12
ASA    Buy                 7
ASRV   Buy                12
       Option Exercise     1
AUBN   Buy                 4
       Sale               11
BAC    Option Exercise    23
       Buy                15
       Sale                7
BCBP   Buy                 3
       Sale               11
BKSC   Buy                55
BMRA   Buy                 5
       Option Exercise     3
       Sale                1
..
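I group the data by ticker symbol and then look at the value counts of each group's Transaction column. I'm trying to keep only the groups in which the count of Buy is greater than the count of Sale.
I can't figure out how to do this.
I tried something like this: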
df.groupby('Stock').filter(lambda x: x.Transaction.value_counts().Buy > x.value_counts().Sale)
df.Transaction.value_counts().Buy
>>>2674
I also tried something along these lines:
df.groupby('Stock').Transaction.filter(lambda x: x if x.value_counts().Buy > x.value_counts().Sale)
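But I can't work out which pandas tool is the right one for this situation.
The output can be anything, from the names of the stocks that satisfy the condition to a printout of the whole group (stock name and transactions).
So the output would look like this: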
ADC    Buy                 2
AMNB   Buy                10
ARCC   Buy                15
ARL    Buy                12
ASA    Buy                 7
ASRV   Buy                12
       Option Exercise     1
BAC    Option Exercise    23
       Buy                15
       Sale                7
BKSC   Buy                55
BMRA   Buy                 5
       Option Exercise     3
       Sale                1
Or just the stock names.
Thanks.
Answer 0 (score: 1)
I would unstack and then query:
d1 = df.groupby('Stock').Transaction.value_counts()
d1.unstack(fill_value=0).query('Buy > Sale')
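To make the two lines above easier to try out, here is a minimal, self-contained sketch. The sample DataFrame is hypothetical (the question does not show the raw rows), and the names wide, more_buys and stock_names are introduced only for illustration.

```python
import pandas as pd

# Hypothetical sample data: one row per transaction, as assumed from the question.
df = pd.DataFrame({
    "Stock": ["ADC", "ADC", "AUBN", "AUBN", "AUBN", "BMRA", "BMRA", "BMRA"],
    "Transaction": ["Buy", "Buy", "Buy", "Sale", "Sale", "Buy", "Buy", "Sale"],
})

d1 = df.groupby("Stock").Transaction.value_counts()

# One row per stock and one column per transaction type;
# a type that never occurs for a stock gets a count of 0.
wide = d1.unstack(fill_value=0)

# Keep only the stocks with more Buy rows than Sale rows.
more_buys = wide.query("Buy > Sale")

# If only the stock names are needed, they are the index of the result.
stock_names = more_buys.index.tolist()
print(stock_names)
```

Note that query('Buy > Sale') needs both a Buy and a Sale column to exist after unstacking, which holds as long as each type appears at least once somewhere in the data.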
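We can restore the original stacked layout with this (it needs import numpy as np). The zeros introduced by fill_value are replaced with NaN so that they are dropped again when stacking:
d1.unstack(fill_value=0).query('Buy > Sale') \
    .replace(0, np.nan).stack().astype(int)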
Stock  Transaction
ADC    Buy                 2
AMNB   Buy                10
ARCC   Buy                15
ARL    Buy                12
ASA    Buy                 7
ASRV   Buy                12
       Option Exercise     1
BAC    Buy                15
       Option Exercise    23
       Sale                7
BKSC   Buy                55
BMRA   Buy                 5
       Option Exercise     3
       Sale                1
dtype: int64
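As an alternative that is not part of the answer above, the groupby().filter idea from the question can probably be made to work as well. As far as I can tell, the original attempts fail because a group without any Buy or Sale rows has no such entry in its value counts, and because the second lambda is not a valid expression. The sketch below uses hypothetical sample data and a hypothetical helper name, more_buys_than_sales, and returns the original rows of the qualifying stocks rather than their counts.

```python
import pandas as pd

# Hypothetical sample data: one row per transaction, as assumed from the question.
df = pd.DataFrame({
    "Stock": ["ADC", "ADC", "AUBN", "AUBN", "AUBN", "BMRA", "BMRA", "BMRA"],
    "Transaction": ["Buy", "Buy", "Buy", "Sale", "Sale", "Buy", "Buy", "Sale"],
})

def more_buys_than_sales(group):
    """Return True when the group has more Buy rows than Sale rows."""
    counts = group["Transaction"].value_counts()
    # .get falls back to 0 when a stock has no Buy (or no Sale) rows at all.
    return counts.get("Buy", 0) > counts.get("Sale", 0)

# filter keeps every row of each group for which the function returns True.
kept = df.groupby("Stock").filter(more_buys_than_sales)

# The stock names alone, if that is all that is needed.
kept_names = sorted(kept["Stock"].unique())
print(kept_names)
```

This calls a Python function once per group, so on large data the unstack and query approach is likely to be faster.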