Question

I have a DataFrame

In [3]: df
Out[3]:
                             Price  Size        Codes
2015-04-13 06:14:49-04:00  100.200   900     FT,R6,IS
2015-04-13 06:14:54-04:00  100.190   100     FT,R6,IS
2015-04-13 06:14:54-04:00  100.190   134     FT,R6,IS
2015-04-13 06:15:02-04:00  100.170   200     FT,R6,IS
...                            ...   ...          ...
[248974 rows x 3 columns]

and a list

exclude = ['R6', 'F2', 'IS']

I want to filter out a row if any item in the exclude list appears in that row's entry under the Codes column of df.

I found that I can do this

In [4]: df.Codes.str.split(',')
Out[4]:
2015-04-13 06:14:49-04:00    [FT, R6, IS]
2015-04-13 06:14:54-04:00    [FT, R6, IS]
2015-04-13 06:14:54-04:00    [FT, R6, IS]
2015-04-13 06:15:02-04:00    [FT, R6, IS]
...
Name: Codes, Length: 248974

Basically, what I want is to query with something like df[df.Codes.split(',') in exclude]. Any help is much appreciated.

Answer 1

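# check is 1 when the row's Codes contain at least one excluded code, else 0
df['check'] = df['Codes'].apply(lambda code: 1 if [elt for elt in code.split(',') if elt in exclude] else 0)
# rows that contain an excluded code (use df['check'] == 0 to keep the others)
df_filtered_out = df[df['check'] == 1]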

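Just in case: apply() on a column calls the function once per element, so here once per row (see the pandas documentation for more information), and `if some_list` evaluates to False when some_list is empty and to True otherwise.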
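To see the two points above in action, here is a minimal sketch on a tiny, made-up frame. The sample rows and the helper name has_excluded are assumptions for illustration only; the logic mirrors the list-comprehension check in this answer, but it returns a boolean instead of 1/0.

```python
import pandas as pd

# Hypothetical sample data, loosely modelled on the question's frame
df = pd.DataFrame(
    {
        "Price": [100.20, 100.19, 100.17],
        "Size": [900, 100, 200],
        "Codes": ["FT,R6,IS", "FT", "F2,XX"],
    }
)
exclude = ["R6", "F2", "IS"]


def has_excluded(code):
    """Return True if any comma-separated code is in the exclude list."""
    matches = [elt for elt in code.split(",") if elt in exclude]
    # An empty list is falsy, a non-empty list is truthy
    return bool(matches)


# Series.apply calls has_excluded once per element of the Codes column
flags = df["Codes"].apply(has_excluded)

# Keep only the rows that contain none of the excluded codes
kept = df[~flags]
print(kept)
```

Using a boolean Series directly avoids the extra check column; whether that is preferable depends on whether you want to keep the flag around for later inspection.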

Answer 2

# for the sake of performance, we turn the lookup list into a set
excludes = set(['R7', 'R5'])

ix = df.Codes.str.split(',').apply(lambda codes: not any(c in excludes for c in codes))
df[ix] # returns the filtered DataFrame
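As a self-contained sketch of the same idea, the snippet below runs the set-based filter on a small, made-up frame. The sample rows are assumptions, the exclude values are taken from the question rather than from this answer, and set.isdisjoint is swapped in as an equivalent way of writing "not any(c in excludes for c in codes)".

```python
import pandas as pd

# Hypothetical sample data for illustration
df = pd.DataFrame(
    {
        "Price": [100.20, 100.19, 100.17],
        "Codes": ["FT,R6,IS", "FT,XX", "R5"],
    }
)

# A set gives constant-time membership tests on average
excludes = {"R6", "F2", "IS"}

# True for rows whose codes share nothing with the exclusion set
ix = df["Codes"].str.split(",").apply(lambda codes: excludes.isdisjoint(codes))

filtered = df[ix]
print(filtered)
```

The mask ix is an ordinary boolean Series, so it can be combined with other conditions using & and | before indexing.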

Pandas DataFrame: comparing each row against a list
