Question

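In Pandas, I used to be able to take a DataFrame column, compare it against a column of a second DataFrame, and keep only the items that are not in the second column, like so: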

notYetIncluded = notYetIncluded.loc[~notYetIncluded["ID"].isin(df_o["ID"])]

However, this no longer works in newer versions of pandas (I get the error ValueError: Buffer dtype mismatch, expected 'Python object' but got 'long long'). What should I do?

The part that seems to cause the breakage is: notYetIncluded["ID"].isin(df_o["ID"])

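I don't know whether it helps, but these columns currently store only numbers such as 4150, 5808, and so on. They are all 4 digits long or shorter.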

For example:

notYetIncluded:

0    5747
1    5746
2    5725
3    5722
4    5720
5    5707
Name: ID, dtype: object

df_o:

24    5365
4     5720
15    5599
Name: ID, dtype: int64

Answer 1

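Convert both columns to strings with astype(str), then compare. That way both sides of isin have the same dtype, instead of object on one side and int64 on the other.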

n = notYetIncluded
notYetIncluded = n[~n["ID"].astype(str).isin(df_o["ID"].astype(str))]
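For reference, here is a small self-contained sketch of the same idea using the sample values from the question. It assumes the object column holds the IDs as strings (the question does not say whether they are strings or Python ints stored as object; astype(str) handles both). The names mask, result, numeric_ids and result_numeric are only illustrative. The second half shows a different technique, not the one above: converting the object column to numbers with pd.to_numeric, which should work only if every value really is numeric.

```python
import pandas as pd

# Sample data from the question. Assumption: the object column stores strings.
notYetIncluded = pd.DataFrame(
    {"ID": ["5747", "5746", "5725", "5722", "5720", "5707"]}, dtype=object
)
df_o = pd.DataFrame({"ID": [5365, 5720, 5599]}, index=[24, 4, 15])

# Approach from the answer: compare both columns as strings.
mask = notYetIncluded["ID"].astype(str).isin(df_o["ID"].astype(str))
result = notYetIncluded.loc[~mask]
print(result["ID"].tolist())  # ['5747', '5746', '5725', '5722', '5707']

# Alternative (not from the answer): compare both columns as numbers.
# pd.to_numeric raises by default if a value cannot be parsed.
numeric_ids = pd.to_numeric(notYetIncluded["ID"])
result_numeric = notYetIncluded.loc[~numeric_ids.isin(df_o["ID"])]
print(result_numeric["ID"].tolist())  # same rows as above
```

Either way, only the row with ID 5720 is dropped, since it is the only one that also appears in df_o. The string comparison is the more forgiving of the two if the column might contain non-numeric values.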