I have a dictionary named dfs that contains DataFrames:
team_id player_id x_loc y_loc radius game_clock shot_clock \
1 -1 -1 27.91690 41.37191 4.18103 710.78 11.71
2 -1 -1 31.90677 36.18951 3.47588 710.30 11.44
3 -1 -1 34.13352 27.62760 1.17149 709.82 11.16
4 -1 -1 34.74723 23.90685 3.42091 709.34 10.88
5 -1 -1 24.68878 15.18316 5.02066 708.86 10.60
6 -1 -1 17.59483 9.16468 3.03803 708.38 10.32
7 -1 -1 18.69309 12.53733 2.22372 707.90 10.04
8 -1 -1 16.23927 17.82597 5.45565 707.42 9.77
9 -1 -1 9.84219 8.62434 8.59493 706.94 9.49
10 -1 -1 5.73599 3.83553 4.77459 706.46 9.21
11 -1 -1 5.49103 3.97060 4.82267 705.98 8.93
12 -1 -1 2.44574 3.85045 0.84340 705.50 8.65
13 -1 -1 30.44487 43.11858 7.48128 713.02 13.01
quarter game_id event_id GAME_ID EVENTMSGTYPE PLAYER1_TEAM_ID
1 1 21500492 1 21500492 NaN 1.610613e+09
2 1 21500492 1 21500492 NaN 1.610613e+09
3 1 21500492 1 21500492 NaN 1.610613e+09
4 1 21500492 1 21500492 NaN 1.610613e+09
5 1 21500492 1 21500492 NaN 1.610613e+09
6 1 21500492 1 21500492 NaN 1.610613e+09
7 1 21500492 1 21500492 NaN 1.610613e+09
8 1 21500492 1 21500492 NaN 1.610613e+09
9 1 21500492 1 21500492 NaN 1.610613e+09
10 1 21500492 1 21500492 NaN 1.610613e+09
11 1 21500492 1 21500492 NaN 1.610613e+09
12 1 21500492 1 21500492 NaN 1.610613e+09
13 1 21500492 2 21500492 2.0 1.610613e+09
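I want to find the ones whose EVENTMSGTYPE column does not contain the values [3,5,6,7,8,9,10,11,12,13] and store them in a new dictionary, but I can't seem to find a way to do it.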
Answer 0 (score: 1)
I think you need a dictionary comprehension with boolean indexing, filtering by isin, with ~ to invert the boolean mask:
vals = [3, 5, 6, 7, 8, 9, 10, 11, 12, 13]
d1 = {k:df[~df['EVENTMSGTYPE'].isin(vals)] for k, df in dfs.items()}
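To make the row filter concrete, here is a minimal self-contained sketch. The sample dictionary, its keys, and the helper name drop_event_types are invented for illustration and are not from the question. It also shows that rows where EVENTMSGTYPE is NaN are kept, because isin returns False for them:

```python
import numpy as np
import pandas as pd


def drop_event_types(dfs, vals):
    """Return a new dict with rows whose EVENTMSGTYPE is in vals removed.

    Hypothetical helper wrapping the comprehension above.
    """
    return {k: df[~df["EVENTMSGTYPE"].isin(vals)] for k, df in dfs.items()}


# Invented stand-in for the question's `dfs`
dfs = {
    "a": pd.DataFrame({"event_id": [1, 2, 3],
                       "EVENTMSGTYPE": [np.nan, 2.0, 3.0]}),
    "b": pd.DataFrame({"event_id": [4, 5],
                       "EVENTMSGTYPE": [5.0, 6.0]}),
}
vals = [3, 5, 6, 7, 8, 9, 10, 11, 12, 13]

d1 = drop_event_types(dfs, vals)
# "a" keeps the NaN row and the 2.0 row; "b" is left with no rows at all
```

Note that key "b" still exists in d1 here, with an empty DataFrame as its value.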
Or use query:
d1 = {k:df.query('EVENTMSGTYPE not in @vals') for k, df in dfs.items()}
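A small sketch of the query variant on invented data follows. The helper name filter_with_query and the sample frames are assumptions. The @vals reference is resolved from the scope that calls query, so this sketch uses a plain loop inside the function, which keeps vals in that scope:

```python
import numpy as np
import pandas as pd


def filter_with_query(dfs, vals):
    """Hypothetical helper: same row filter, expressed with DataFrame.query."""
    out = {}
    for k, df in dfs.items():
        # @vals refers to the local variable `vals` of this function
        out[k] = df.query("EVENTMSGTYPE not in @vals")
    return out


# Invented sample data
dfs = {
    "a": pd.DataFrame({"event_id": [1, 2, 3],
                       "EVENTMSGTYPE": [np.nan, 2.0, 3.0]}),
}
vals = [3, 5, 6, 7, 8, 9, 10, 11, 12, 13]

d_query = filter_with_query(dfs, vals)
# Reference result using the isin mask, for comparison
d_isin = {k: df[~df["EVENTMSGTYPE"].isin(vals)] for k, df in dfs.items()}
```

On this sample both approaches select the same rows.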
To filter out empty DataFrames, use:
d1 = {k: df[~df['EVENTMSGTYPE'].isin(vals)] for k, df in dfs.items()
      if not df['EVENTMSGTYPE'].isin(vals).all()}
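One possible alternative, sketched under the same invented sample data, is to filter once and then drop the empty results in a second step, which avoids calling isin twice per DataFrame. The helper name drop_event_types_nonempty is hypothetical:

```python
import pandas as pd


def drop_event_types_nonempty(dfs, vals):
    """Hypothetical helper: filter rows, then discard keys left with no rows."""
    filtered = {k: df[~df["EVENTMSGTYPE"].isin(vals)] for k, df in dfs.items()}
    return {k: df for k, df in filtered.items() if not df.empty}


# Invented sample data: every row of "b" has an excluded value
dfs = {
    "a": pd.DataFrame({"event_id": [1, 2], "EVENTMSGTYPE": [2.0, 3.0]}),
    "b": pd.DataFrame({"event_id": [3, 4], "EVENTMSGTYPE": [5.0, 6.0]}),
}
vals = [3, 5, 6, 7, 8, 9, 10, 11, 12, 13]

d1 = drop_event_types_nonempty(dfs, vals)
# Only "a" survives, with the single row whose EVENTMSGTYPE is 2.0
```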
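Edit: to keep only the DataFrames in which no row has one of these values, stored under new integer keys:
d1 = {}
last = 0
for k, df in dfs.items():
    # True for rows whose EVENTMSGTYPE is not in vals
    m = ~df['EVENTMSGTYPE'].isin(vals)
    # keep the whole DataFrame only if every row passes
    if m.all():
        d1[last] = df
        last += 1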
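The same whole-DataFrame selection can also be sketched more compactly with any() and enumerate. This is an equivalent formulation on invented data, not the answer's original code, and the helper name keep_clean_frames is hypothetical:

```python
import pandas as pd


def keep_clean_frames(dfs, vals):
    """Hypothetical helper: keep DataFrames with no EVENTMSGTYPE value in vals.

    The result is re-keyed 0, 1, 2, ... in the iteration order of dfs.
    """
    clean = [df for df in dfs.values()
             if not df["EVENTMSGTYPE"].isin(vals).any()]
    return dict(enumerate(clean))


# Invented sample data: "a" contains an excluded value, "b" and "c" do not
dfs = {
    "a": pd.DataFrame({"event_id": [1, 2], "EVENTMSGTYPE": [2.0, 3.0]}),
    "b": pd.DataFrame({"event_id": [3, 4], "EVENTMSGTYPE": [1.0, 2.0]}),
    "c": pd.DataFrame({"event_id": [5], "EVENTMSGTYPE": [4.0]}),
}
vals = [3, 5, 6, 7, 8, 9, 10, 11, 12, 13]

d1 = keep_clean_frames(dfs, vals)
# d1 has keys 0 and 1, holding the unmodified "b" and "c" DataFrames
```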