Question

Let's say I have two DataFrames.

The first is a large list (2400+ values):

101  102  103  104   [index value]
"A"  "B"  "C"  "D"   [another string] 
"1"  "1"  "1"  "1"   [another string] 
"2"  "2"  "2"  "2"   [another string]

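The second is a DataFrame of disqualified values that I want to remove from the first one. It may also contain some values that do not appear in the first DataFrame: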

101 104 205  [index value]
"A" "D" "Q"  [another string] 
"1" "1" "2"  [another string] 
"2" "2" "1"  [another string]

How do I match the two up and remove the matching entries from the first DataFrame? In this example, I want to end up with:

102  103   [index value]
"B"  "C"   [another string] 
"1"  "1"   [another string] 
"2"  "2"   [another string]

Answer 1

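Assuming you have a DataFrame df whose index values are stored in some column index_column, and a disqualification (dsq) DataFrame with a column of the same name: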

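# Collect the disqualified ids as a plain list
dsq = df_dsq['index_column'].to_list()
# Keep only the rows whose id is not in the disqualified list
df_clean = df.loc[~df['index_column'].isin(dsq), :].copy()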
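The snippet above assumes the ids live in a column (index_column is the answerer's placeholder name). If the data is laid out as in the question, with the ids 101, 102, ... as column labels, the same isin idea can be applied to the columns instead. A minimal sketch under that assumption, using the sample values from the question:

```python
import pandas as pd

# Sample data from the question, with the ids as column labels (an assumption
# about the layout; the question only shows the values side by side).
df = pd.DataFrame(
    [["A", "B", "C", "D"], ["1", "1", "1", "1"], ["2", "2", "2", "2"]],
    columns=[101, 102, 103, 104],
)
df_dsq = pd.DataFrame(
    [["A", "D", "Q"], ["1", "1", "2"], ["2", "2", "1"]],
    columns=[101, 104, 205],
)

# Keep only the columns whose label is not among the disqualified ids.
# Ids that appear only in df_dsq (205 here) are simply ignored.
df_clean = df.loc[:, ~df.columns.isin(df_dsq.columns)].copy()

print(df_clean.columns.tolist())  # [102, 103]
```

An alternative that should behave the same way here is df.drop(columns=df_dsq.columns, errors="ignore"), where errors="ignore" skips labels such as 205 that are missing from df.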
