This is what my data looks like:
print(len(y_train),len(index_1))
index_1 = pd.DataFrame(data=index_1)
print("y_train: ")
print(y_train)
print("index_1: ")
print(index_1)
Output:
1348 555
y_train:
1677 1
1519 0
1114 0
690 1
1012 1
..
1893 1
1844 0
1027 1
1649 1
1789 1
Name: Team 1 Win, Length: 1348, dtype: int64
index_1:
0
0 0
1 2
2 6
3 7
4 8
.. ...
550 1335
551 1341
552 1342
553 1344
554 1346
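I want to remove a number of rows (index_1) from a pandas DataFrame (y_train). The values in the index_1 df are the rows I want to delete. The problem is that the DataFrame's index is not in order, so when the first item of index_1 is 0, I want it to delete the first row of y_train (i.e. the one with index 1677), not the row whose index label is 0. Here is my attempt: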
y_train_short = y_train.drop(index_1)
This is what I get:
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
<ipython-input-57-49f2cce7bac0> in <module>
22 print(index_1)
23 print(index_1)
---> 24 y_train_short = y_train.drop(index_1)
25
26
~/miniconda3/lib/python3.7/site-packages/pandas/core/series.py in drop(self, labels, axis, index, columns, level, inplace, errors)
4137 level=level,
4138 inplace=inplace,
-> 4139 errors=errors,
4140 )
4141
~/miniconda3/lib/python3.7/site-packages/pandas/core/generic.py in drop(self, labels, axis, index, columns, level, inplace, errors)
3934 for axis, labels in axes.items():
3935 if labels is not None:
-> 3936 obj = obj._drop_axis(labels, axis, level=level, errors=errors)
3937
3938 if inplace:
~/miniconda3/lib/python3.7/site-packages/pandas/core/generic.py in _drop_axis(self, labels, axis, level, errors)
3968 new_axis = axis.drop(labels, level=level, errors=errors)
3969 else:
-> 3970 new_axis = axis.drop(labels, errors=errors)
3971 result = self.reindex(**{axis_name: new_axis})
3972
~/miniconda3/lib/python3.7/site-packages/pandas/core/indexes/base.py in drop(self, labels, errors)
5016 if mask.any():
5017 if errors != "ignore":
-> 5018 raise KeyError(f"{labels[mask]} not found in axis")
5019 indexer = indexer[~mask]
5020 return self.delete(indexer)
KeyError: '[0] not found in axis'
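Regardless of the fact that index 0 does not exist in y_train, I imagine that even if it did exist, this would not do what I want. So how do I delete the correct rows from this DataFrame?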
Answer 0 (score: 1)
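Note that y_train.iloc[index_1[0]] retrieves the rows of y_train that occupy the given integer positions.
When you run y_train.iloc[index_1[0]].index, you get the index labels of those rows.
So, to drop those rows, you can run: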
y_train.drop(y_train.iloc[index_1[0]].index, inplace=True)
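To illustrate the position-to-label translation on a small scale, here is a minimal sketch. The five-row y_train and the three-entry index_1 are made-up stand-ins for the data in the question, and the names labels_to_drop and y_train_short are only illustrative. It returns a new Series instead of using inplace=True.

```python
import pandas as pd

# Hypothetical stand-in for y_train: the index labels are shuffled, as in the question.
y_train = pd.Series(
    [1, 0, 0, 1, 1],
    index=[1677, 1519, 1114, 690, 1012],
    name="Team 1 Win",
)
# Hypothetical stand-in for index_1: integer positions of the rows to remove.
index_1 = pd.DataFrame(data=[0, 2, 3])

# Look up the index labels that sit at those integer positions ...
labels_to_drop = y_train.index[index_1[0].to_numpy()]
# ... and drop by label, which is what drop() expects.
y_train_short = y_train.drop(labels_to_drop)
print(y_train_short)
```

One caveat: drop() works on labels, so if the index contained duplicate labels, every row sharing a dropped label would be removed, not only the row at the given position.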
Answer 1 (score: 0)
You can use isin on the index:
# set index to start from 0
y_train = y_train.reset_index(drop=True)
# do simple filter
y_train = y_train[~y_train.index.isin(index_1[0])].copy()
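Note that reset_index(drop=True) discards the original labels (1677, 1519, ...). If those labels need to be kept, one possible variant is a positional boolean mask built with NumPy. This is a sketch under the same assumptions as the question (index_1[0] holds integer positions); the sample data and the names keep and y_train_kept are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in data shaped like the question's y_train and index_1.
y_train = pd.Series(
    [1, 0, 0, 1, 1],
    index=[1677, 1519, 1114, 690, 1012],
    name="Team 1 Win",
)
index_1 = pd.DataFrame(data=[0, 2, 3])

# Start with "keep every row", then switch off the positions listed in index_1.
keep = np.ones(len(y_train), dtype=bool)
keep[index_1[0].to_numpy()] = False

# Boolean indexing selects by position here, so the original labels survive.
y_train_kept = y_train[keep]
print(y_train_kept)
```

Because the mask is purely positional, this variant also behaves predictably when the index contains duplicate labels.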