Question

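So, I have a dictionary of DataFrames. Because I don't know how to work on them directly inside the dictionary, I currently split it into individual DataFrames. They hold the results of test specimens and look like this: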

T01
mm    N    Cycle
a     1      1
b     2      1
c     3      2
d     4      2
e     5      3
...   ...    ...

Now I have made another DataFrame (I also tried using a list):

Cycles
1
3
5
...

My goal is to filter out every row where

Cycle != Cycles

so that I end up with a table that looks like this:

mm    N    Cycle
a     1      1
b     2      1
e     5      3
f     6      3
...   ...    ...

I created the DataFrames with the following code:

for k,v in data_dict.items():    
    globals()[k] = (v[['mm','N', 'Unnamed: 3']])
    globals()[k].columns = ['mm', 'N', 'Cycles']

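Now I have 7 different DataFrames, ranging in size from (2570, 3) to (12402, 3).

I have been looking around but can't seem to find an approach that works. Maybe I shouldn't be using the globals()[k] call? I'm new to Python, so I lack a lot of knowledge. Thanks in advance.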

Edit: I have been trying boolean operations such as

 globals()[k].query("Cycle" != Cycles['Cycles'])

or

 globals()[k][globals()[k].Cycle != Cycles['Cycles']]

but neither of these got me anywhere.

Answer 1

Assuming you have a list of cycles to filter by:

cycle_list = [1, 2, 3]

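Now, given a dictionary of DataFrames data_dict, you can use a dictionary comprehension with a boolean mask to keep only the rows that satisfy the condition: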

res = {k: v[v['Cycles'].isin(cycle_list)] for k, v in data_dict.items()}

This assumes each of your DataFrames has a Cycles series. pd.Series.isin returns a boolean series, which is then used to index your DataFrame.
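
To make the mask visible, here is a minimal sketch on a single DataFrame. The sample values are made up for illustration (they only mirror the layout in the question), and the column is assumed to be named Cycles as in the comprehension above.

```python
import pandas as pd

# Hypothetical sample standing in for one test specimen.
df = pd.DataFrame({
    "mm": ["a", "b", "c", "d", "e"],
    "N": [1, 2, 3, 4, 5],
    "Cycles": [1, 1, 2, 2, 3],
})
cycle_list = [1, 3, 5]

# isin() checks each value in the column for membership in cycle_list
# and returns a boolean Series aligned with df's index.
mask = df["Cycles"].isin(cycle_list)

# Indexing with the mask keeps only the rows where it is True.
filtered = df[mask]
print(filtered)
```

Writing df[~mask] instead would give the complementary rows, i.e. those whose cycle is not in the list.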

Also, note that there are very few good reasons to use globals(); you should avoid calling it wherever possible.
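
As a rough sketch of what that could look like, the column selection and renaming from the question can be done inside a dictionary instead of through globals(). The contents of data_dict below are invented placeholders, and renaming 'Unnamed: 3' to 'Cycle' is an assumption based on the table shown in the question.

```python
import pandas as pd

# Hypothetical stand-in for the asker's data_dict; the real frames are much larger.
data_dict = {
    "T01": pd.DataFrame({"mm": [0.1, 0.2, 0.3], "N": [1, 2, 3],
                         "Unnamed: 3": [1, 1, 2], "other": [0, 0, 0]}),
    "T02": pd.DataFrame({"mm": [0.4, 0.5], "N": [4, 5],
                         "Unnamed: 3": [3, 4], "other": [0, 0]}),
}

# Build a new dictionary of trimmed, renamed frames; no global variables are created.
frames = {
    name: df[["mm", "N", "Unnamed: 3"]].rename(columns={"Unnamed: 3": "Cycle"})
    for name, df in data_dict.items()
}

# A single specimen is still easy to reach by its key.
t01 = frames["T01"]
print(t01.columns.tolist())
```

Keeping everything in one dictionary means later steps can loop over frames.items() rather than looking names up in the module namespace.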

Answer 2

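So, you have a dictionary of DataFrames, which we will call data_dict, and a list of cycles, Cycles.
You say you don't know how to work with a dictionary of DataFrames, but once you know how to handle a single DataFrame that is actually the easy part, because you only need to iterate over it. To solve your problem, just use the isin method: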

res = {}
for id, df in data_dict.items():
    #loop over the dataframes in the dict
    res[id] = df[df.Cycle.isin(Cycles)]
    #this is a new dataframe where you have only the cycles in Cycles
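
If Cycles is a DataFrame rather than a plain list, as in the question, it is probably safest to pass the column itself to isin. A minimal sketch, assuming that DataFrame has a column named 'Cycles' and using made-up specimen data:

```python
import pandas as pd

# Hypothetical inputs mirroring the question's layout.
Cycles = pd.DataFrame({"Cycles": [1, 3, 5]})
data_dict = {
    "T01": pd.DataFrame({"mm": ["a", "b", "c"], "N": [1, 2, 3], "Cycle": [1, 1, 2]}),
    "T02": pd.DataFrame({"mm": ["d", "e", "f"], "N": [4, 5, 6], "Cycle": [2, 3, 3]}),
}

# Pull the wanted cycle numbers out of the DataFrame as a plain list.
wanted = Cycles["Cycles"].tolist()

res = {}
for name, df in data_dict.items():
    # Keep only the rows whose Cycle value appears in wanted.
    res[name] = df[df["Cycle"].isin(wanted)]

print(res["T02"])
```

The loop variable is called name here instead of id only to avoid shadowing the built-in id function.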

Answer 3

Something like this?

Say your DataFrames are defined as:

test = pd.DataFrame({'mm': ['1', 'b', 'c', 'd', 'e'], 'N': [1, 2, 3, 4, 5], 'Cycle': [1, 1, 2, 2, 3]})

test2 = pd.DataFrame({'Cycles': [1, 3, 5]})

To filter the rows by that column:

test3 = test[test.Cycle.isin(test2.Cycles)]

Out[19]:
   Cycle  N mm
0      1  1  1
1      1  2  b
4      3  5  e
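
Since the question also tried DataFrame.query, here is a sketch of how the same filter might be written with it. Inside a query string, an @ prefix refers to a Python variable and in tests membership; the variable name wanted is just an illustrative choice.

```python
import pandas as pd

test = pd.DataFrame({'mm': ['1', 'b', 'c', 'd', 'e'],
                     'N': [1, 2, 3, 4, 5],
                     'Cycle': [1, 1, 2, 2, 3]})
test2 = pd.DataFrame({'Cycles': [1, 3, 5]})

# Plain list of the cycle numbers to keep.
wanted = test2['Cycles'].tolist()

# "Cycle" names the column; "@wanted" refers to the Python variable above.
test3 = test.query("Cycle in @wanted")
print(test3)
```

This should select the same rows as the isin version; which one to use is mostly a matter of taste.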

Python - Filter a DataFrame by comparing a column against another df / list
