I have the following DataFrame:
In [4]:
df
Out[4]:
Symbol Date Strike C/P Bid Ask
0 GS 6/15/2015 200 c 5 72
1 GS 6/15/2015 200 p 5 72
2 GS 6/15/2015 210 c 15 0
3 GS 6/15/2015 210 p 15 54
4 GS 7/15/2015 200 c 20 50
5 GS 7/15/2015 200 p 20 0
6 GS 7/15/2015 210 c 4 90
7 GS 7/15/2015 210 p 4 90
8 IBM 6/15/2015 150 c 12 27
9 IBM 6/15/2015 150 p 12 0
10 IBM 6/15/2015 160 c 1 58
11 IBM 6/15/2015 160 p 1 3
12 IBM 7/15/2015 120 c 13 39
13 IBM 7/15/2015 120 p 13 39
14 IBM 7/15/2015 130 c 4 45
15 IBM 7/15/2015 130 p 4 45
I want to filter out both the c and p rows for a given strike if the Ask value is 0 for either of them:
Symbol Date Strike Call/Put Bid Ask yminx
GS 6/15/2015 200 c 5 72 90
GS 6/15/2015 200 p 5 72 90
GS 7/15/2015 210 c 4 90 90
GS 7/15/2015 210 p 4 90 90
IBM 6/15/2015 160 c 1 58 58
IBM 6/15/2015 160 p 1 3 58
IBM 7/15/2015 120 c 13 39 58
IBM 7/15/2015 120 p 13 39 58
IBM 7/15/2015 130 c 4 45 58
IBM 7/15/2015 130 p 4 45 58
I can filter on Ask being 0 and remove that row by doing:
df = df[df.Ask != 0]
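But I can't figure out how to remove the other row, which has the same symbol/date/strike combination but a non-zero Ask.
Any help would be appreciated.
Answer 0 (score: 2)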
>>> mask = df.groupby(['Symbol', 'Date', 'Strike'])['Ask'].transform('all')
>>> df[~mask]
Symbol Date Strike C/P Bid Ask
2 GS 6/15/2015 210 c 15 0
3 GS 6/15/2015 210 p 15 54
4 GS 7/15/2015 200 c 20 50
5 GS 7/15/2015 200 p 20 0
8 IBM 6/15/2015 150 c 12 27
9 IBM 6/15/2015 150 p 12 0
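mask is True for every row whose Symbol/Date/Strike group has only non-zero Ask values, so df[~mask] shows the groups that contain a zero. To remove those rows, use df[mask].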
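For reference, here is a minimal self-contained sketch of the same idea, rebuilt from the sample data in the question. The names `keys`, `has_ask` and `kept` are illustrative only and not part of the original answer, and the zero test is written out explicitly with `ne(0)` instead of relying on the truthiness of `Ask`.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Symbol": ["GS"] * 8 + ["IBM"] * 8,
    "Date": (["6/15/2015"] * 4 + ["7/15/2015"] * 4) * 2,
    "Strike": [200, 200, 210, 210, 200, 200, 210, 210,
               150, 150, 160, 160, 120, 120, 130, 130],
    "C/P": ["c", "p"] * 8,
    "Bid": [5, 5, 15, 15, 20, 20, 4, 4, 12, 12, 1, 1, 13, 13, 4, 4],
    "Ask": [72, 72, 0, 54, 50, 0, 90, 90, 27, 0, 58, 3, 39, 39, 45, 45],
})

# Hypothetical helper names, chosen for this sketch.
keys = ["Symbol", "Date", "Strike"]

# Row-level test: is this row's Ask non-zero?
has_ask = df["Ask"].ne(0)

# Broadcast the per-group "all rows non-zero" result back to every row.
mask = has_ask.groupby([df[k] for k in keys]).transform("all")

# Keep only the groups in which both the c and p rows have a non-zero Ask.
kept = df[mask]
print(kept)
```

Because `transform` returns a Series aligned with the original index, `mask` can be used directly for boolean indexing, and the surviving rows keep their original order and labels.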
Answer 1 (score: 1)
To filter out whole groups of rows, we need to use the 'filter' function rather than 'apply'.
by = df.groupby(['Symbol', 'Date', 'Strike'])
# These are used as filter functions; each returns a single boolean per group.
# DataFrameGroupBy.filter() keeps every group for which the function
# returns True and drops the rest.
def equal_to_45(group):
# return True if either Call or Put has an Ask = 45
return any(group.Ask.values == 45)
def keep_geq_45(group):
# return True if both Call and Put have an Ask greater than or equal to 45;
# that is equivalent to dropping every group that contains an Ask below 45
return all(group.Ask.values >= 45)
# this time, use filter function instead of apply
by.filter(equal_to_45)
Out[242]:
Symbol Date Strike C/P Bid Ask
14 IBM 2015-07-15 130 c 4 45
15 IBM 2015-07-15 130 p 4 45
by.filter(keep_geq_45)
Out[243]:
Symbol Date Strike C/P Bid Ask
0 GS 2015-06-15 200 c 5 72
1 GS 2015-06-15 200 p 5 72
6 GS 2015-07-15 210 c 4 90
7 GS 2015-07-15 210 p 4 90
14 IBM 2015-07-15 130 c 4 45
15 IBM 2015-07-15 130 p 4 45
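Applied to the condition in the question (drop a strike when either Ask is 0), the same filter approach might look like the sketch below. The function name `no_zero_ask` is made up for this example, and the DataFrame is rebuilt from the question's sample data with Date left as plain strings.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Symbol": ["GS"] * 8 + ["IBM"] * 8,
    "Date": (["6/15/2015"] * 4 + ["7/15/2015"] * 4) * 2,
    "Strike": [200, 200, 210, 210, 200, 200, 210, 210,
               150, 150, 160, 160, 120, 120, 130, 130],
    "C/P": ["c", "p"] * 8,
    "Bid": [5, 5, 15, 15, 20, 20, 4, 4, 12, 12, 1, 1, 13, 13, 4, 4],
    "Ask": [72, 72, 0, 54, 50, 0, 90, 90, 27, 0, 58, 3, 39, 39, 45, 45],
})

by = df.groupby(["Symbol", "Date", "Strike"])


def no_zero_ask(group):
    # Hypothetical filter function: True only when every row in the
    # group (both the c and the p) has a non-zero Ask.
    return bool((group["Ask"] != 0).all())


# filter() keeps the groups for which no_zero_ask returns True.
result = by.filter(no_zero_ask)
print(result)
```

This keeps the same rows as indexing with the boolean mask from the other answer. The filter version calls a Python function once per group, so on large frames the transform-based mask is likely to be faster.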