Question

This is the sample dataframe I am working with:

df:

a   b      c
a1  P1,P3  abc
a2  P2,P4  def
a3  P2     ghi

I want to apply filters to multiple columns of a dataframe, where the columns contain comma-separated values.

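The filter data is a series called df_filters, in the following form:

df_filters

a  [a1]
b  [P1, P4]

In the filter data, the first column holds the column name as a string and the second column holds the filter values as a list.

Filter the dataframe df using df_filters above and get the following result:

df1

Result1:
a   b      c
a1  P1,P3  abc
a2  P2,P4  def

Conclusion: in df1, column a considers only rows with the value a1, and column b considers only rows that contain the value P1 or P4. In row 1 of column b, P1 and P3 are two different values separated by a comma.

Is there any way I can achieve the above Result for df1?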

For a reference on a similar situation, see the following link: Apply a list of filters to a dataframe coming from a list using pandas

Answer 1

Use:

byte[] sig = privKey.SignData(Encoding.ASCII.GetBytes(signingString), CryptoConfig.MapNameToOID("SHA256"), RSASignaturePadding.Pkcs1);

Answer 2

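For each value, you want to check whether it is present in the corresponding df_filters list. Since the column can contain either a list or a single item, that needs to be checked as well.
Because this case is a bit complicated, I moved the logic into a separate function, _filter_func.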

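import numpy as np
import pandas as pd

def _filter_func(x, f_vals_set):
    # Cells hold comma-separated strings (e.g. "P1,P3"), so split them into a list;
    # a cell that is already a list is used as-is.
    if not isinstance(x, list):
        x = [v.strip() for v in str(x).split(",")]
    # Check whether any value in the cell appears in the filter set
    matching_vals = f_vals_set.intersection(x)
    return len(matching_vals) > 0

# One boolean mask per filtered column; df_filters maps column name -> list of values
conditions = [df[col].apply(lambda x: _filter_func(x, set(f_vals))) for col, f_vals in df_filters.items()]
# pd.np was removed in pandas 2.0, so use NumPy directly; rows matching any condition are kept
df.loc[np.logical_or.reduce(conditions)]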
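
As a check against the question's sample data, here is a minimal end-to-end sketch of the same idea. It assumes df_filters is a pandas Series mapping each column name to a list of accepted values, and that a row is kept when any one of the filtered columns matches, which is what Result1 in the question shows. The helper name matches_any is made up for this illustration.

```python
import numpy as np
import pandas as pd

# Sample data from the question; column b holds comma-separated strings.
df = pd.DataFrame({
    "a": ["a1", "a2", "a3"],
    "b": ["P1,P3", "P2,P4", "P2"],
    "c": ["abc", "def", "ghi"],
})

# Assumed shape of the filter data: column name -> list of accepted values.
df_filters = pd.Series({"a": ["a1"], "b": ["P1", "P4"]})


def matches_any(cell, wanted):
    """Return True if any comma-separated token in `cell` is in `wanted`."""
    tokens = {token.strip() for token in str(cell).split(",")}
    return not wanted.isdisjoint(tokens)


# One boolean mask per filtered column.
masks = [
    df[col].apply(matches_any, wanted=set(values))
    for col, values in df_filters.items()
]

# Keep a row when at least one column matches (logical OR across masks).
df1 = df.loc[np.logical_or.reduce(masks)]
print(df1)
```

With this data, the rows for a1 and a2 are kept and the a3 row is dropped. If every filter should have to match instead, swapping np.logical_or for np.logical_and would be the corresponding change.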

Filtering comma-separated values in multiple columns of a pandas dataframe

2 answers: