Question

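I have a DataFrame with 200 index labels. I want to drop every row except those with certain index labels, for example 128, 133, 140, 143, 199.

Previously, I dropped all rows with index labels 128, 133, 140, 143, 199, and that worked fine. My code was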

dataset_drop = dataset.drop(index = [128, 133, 140, 143, 199])

Now I am trying to do the opposite. I want to keep the rows with index labels 128, 133, 140, 143, 199 and drop all the others.

What I tried:

dropped_data = dataset.drop(index != [128, 133, 140, 143, 199])

When I run this, I get the error

NameError: name 'index' is not defined

Can anyone tell me what I am doing wrong?

Answer 1

To explain why the exception occurs: the expression

index != [128, 133, 140, 143, 199]

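is evaluated as a comparison expression instead of treating index as a keyword argument. Python looks up a variable named index to compare against the list. Since no variable named index is defined, you see a NameError.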
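To see this in isolation, here is a minimal sketch. The 200-row frame and its "value" column are made up for illustration and are not from the question. The argument expression is evaluated before drop is ever called, so the NameError comes from Python itself, not from pandas.

```python
import pandas as pd

# Hypothetical stand-in for the asker's data: 200 rows labelled 0..199.
dataset = pd.DataFrame({"value": range(200)})
keep = [128, 133, 140, 143, 199]

try:
    # `index != keep` is an ordinary comparison, so Python first looks up
    # a variable called `index`, which does not exist in this scope.
    dataset.drop(index != keep)
except NameError as exc:
    message = str(exc)

print(message)
```

Writing index=... (a single equals sign) is what makes it a keyword argument; there is no keyword form of !=.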

To make your drop-based approach work, use Index.difference:

dataset.drop(index=dataset.index.difference([128, 133, 140, 143, 199]))

Or, more idiomatically, since you already know the labels you want to keep, select them with loc.

dataset.loc[[128, 133, 140, 143, 199]]
# If these are integer positions rather than index labels, use iloc instead:
# dataset.iloc[[128, 133, 140, 143, 199]]
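As a rough check that the two approaches agree, the sketch below runs both on a made-up 200-row frame (the "value" column and the default 0..199 index are assumptions, not the asker's data). It also shows Index.isin as a third option; that one is not mentioned in the answer above and is included only as a possible alternative for when some labels might be missing.

```python
import pandas as pd

# Hypothetical data: default integer index 0..199, one illustrative column.
dataset = pd.DataFrame({"value": range(200)})
keep = [128, 133, 140, 143, 199]

# Drop everything that is NOT in `keep`.
via_drop = dataset.drop(index=dataset.index.difference(keep))

# Select the wanted labels directly; raises KeyError if a label is absent.
via_loc = dataset.loc[keep]

# Boolean mask on the index; labels absent from the index are simply ignored.
via_isin = dataset[dataset.index.isin(keep)]

print(via_drop.index.tolist())
print(via_loc.index.tolist())
print(via_isin.index.tolist())
```

Note that loc returns rows in the order of the list you pass, while drop and the isin mask preserve the DataFrame's original row order. Here keep is already sorted, so the results coincide.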

Answer 2

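As @pault said, you cannot use a comparison (!=) here because index is a named argument. What I would do instead is build a list of all the indices, for example: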

indices = list(range(0, 200))

Then remove the ones you want to keep:

for x in [128, 133, 140, 143, 199]:
    indices.remove(x)

Now you have a list of all the indices to drop:

dropped_data = dataset.drop(index=indices)
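The list(range(0, 200)) step assumes the index is exactly 0 through 199. A small variation, sketched below on made-up data (the frame and its "value" column are hypothetical), builds the drop list from dataset.index itself, so it would also work if the labels were not a contiguous range.

```python
import pandas as pd

# Hypothetical data standing in for the asker's DataFrame.
dataset = pd.DataFrame({"value": range(200)})
keep = {128, 133, 140, 143, 199}  # a set makes the membership test cheap

# Every existing label that is not in `keep` goes on the drop list.
indices = [label for label in dataset.index if label not in keep]

dropped_data = dataset.drop(index=indices)
print(dropped_data.index.tolist())
```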