Question

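I need to iterate over a DataFrame to identify rows that do not match a key-value pair. All keys are distinct, but the values may repeat.

When I print the result after the loop, I get an empty DataFrame. I have confirmed that some rows should be included (the keys and values are present in the DataFrame).

Some sample code: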

import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randint(5,10,size=(15, 4)), columns=list('ABCD'))
dept = {7: [5], 8: [7], 5: [9, 10], 9: [9]}

nd = pd.DataFrame()
for key, value in dept.items():
    f = df.loc[df['A']==key, :]
    ff = f.loc[~f['B'].isin(value), :]
    print(type(ff))
    print(ff.shape)
    nd.append(ff)
print(nd)

I get the following output:

<class 'pandas.core.frame.DataFrame'>
(4, 4)
<class 'pandas.core.frame.DataFrame'>
(1, 4)
<class 'pandas.core.frame.DataFrame'>
(2, 4)
<class 'pandas.core.frame.DataFrame'>
(1, 4)
Empty DataFrame
Columns: []
Index: []

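Since the shapes are correct, I believe this has something to do with the type. How can I extract a DataFrame from this type?

I have searched Stack Overflow high and low but have not found an example of this kind. Thanks for your help!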

Answer 1

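DataFrame.append returns a new DataFrame instead of modifying nd in place, so the result of nd.append(ff) in your loop is discarded (the method was also removed in pandas 2.0). Try pd.concat and assign the result back to nd: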

nd = pd.DataFrame()
for key, value in dept.items():
    f = df.loc[df['A']==key, :]
    ff = f.loc[~f['B'].isin(value), :]
    print(type(ff))
    print(ff.shape)

    frames = [nd, ff]
    nd = pd.concat(frames)
print(nd)
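
A variation on the loop above, offered only as a sketch and not as the answerer's code: collect the filtered pieces in a list and call pd.concat once after the loop, so nd is not copied again on every iteration. The small fixed frame and the names filter_mismatches and pieces are assumptions made up for illustration.

```python
import pandas as pd


def filter_mismatches(df, dept):
    """Return rows whose 'A' is a key of dept but whose 'B' is not in that key's list."""
    pieces = []
    for key, values in dept.items():
        subset = df.loc[df['A'] == key]
        mismatched = subset.loc[~subset['B'].isin(values)]
        if not mismatched.empty:
            pieces.append(mismatched)
    # pd.concat raises on an empty list, so fall back to an empty frame
    # that keeps the original columns.
    return pd.concat(pieces) if pieces else df.iloc[0:0]


# Hypothetical fixed data so the result is reproducible.
df = pd.DataFrame({'A': [7, 7, 8, 5, 5, 6],
                   'B': [5, 6, 7, 9, 8, 5],
                   'C': [1, 2, 3, 4, 5, 6],
                   'D': [6, 5, 4, 3, 2, 1]})
dept = {7: [5], 8: [7], 5: [9, 10], 9: [9]}

nd = filter_mismatches(df, dept)
print(nd)
```

With this sample data, only the rows at index 1 (A=7, B=6) and index 4 (A=5, B=8) should remain.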

Answer 2

Another option is to use Series.map() and then filter with df.query():

(df.assign(E=df['A'].map(dept)).dropna(subset=['E']).explode('E').query("B != E")
   .drop(columns="E").drop_duplicates())

    A  B  C  D
2   9  6  7  6
5   8  9  7  5
6   8  5  7  5
8   9  8  9  8
10  7  7  7  5
11  5  5  8  7
12  9  7  8  8
13  9  6  8  8
14  9  6  9  9
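
One caveat, as far as I can tell: when a key maps to several values (such as 5: [9, 10]), a row whose B equals one of them still differs from the other after explode, so it can survive the B != E filter. drop_duplicates may also merge rows that were genuinely identical in the original frame. A rough alternative that tests membership row by row is sketched below; the sample data and the names allowed, known, and matches are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical fixed data; rows 3 and 4 both match the multi-valued key 5.
df = pd.DataFrame({'A': [7, 7, 8, 5, 5, 5, 6],
                   'B': [5, 6, 7, 9, 10, 8, 5],
                   'C': [1, 2, 3, 4, 5, 6, 7],
                   'D': [7, 6, 5, 4, 3, 2, 1]})
dept = {7: [5], 8: [7], 5: [9, 10], 9: [9]}

# Look up the list of accepted B values for each row's key (NaN if the key is unknown).
allowed = df['A'].map(dept)
known = allowed.notna()

# Row-wise membership test, restricted to rows whose key exists in dept.
matches = pd.Series(
    [b in values for b, values in zip(df.loc[known, 'B'], allowed[known])],
    index=df.index[known],
    dtype=bool,
)

# Keep rows with a known key whose B is not among the accepted values.
result = df.loc[known].loc[~matches]
print(result)
```

Here the rows at index 1 (A=7, B=6) and index 5 (A=5, B=8) are kept, while the rows with A=5 and B of 9 or 10 are treated as matches and dropped.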
