Question

For the DataFrame:

import numpy as np
import pandas as pd

df = pd.DataFrame({
    'key': [1, 2, 3, 4, 5, np.nan, np.nan],
    'value': ['one', 'two', 'three', 'four', 'five', 'six', 'seven']
}).set_index('key')

which looks like this:

        value
key     
1.0     one
2.0     two
3.0     three
4.0     four
5.0     five
NaN     six
NaN     seven

I would like to subset it to:

    value
key     
1   one
1   one
6   NaN

This produces a warning:

df.loc[[1,1,6],]

Passing list-likes to .loc or [] with any missing label will raise
KeyError in the future, you can use .reindex() as an alternative.

This produces an error:

df.reindex([1, 1, 6])

ValueError: cannot reindex from a duplicate axis

How can I do this when one of the requested index labels is missing, without using apply?

Answer 1

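The problem is that you have a duplicated value, NaN, in the index. You should discard those labels when reindexing: because they are duplicated, it is ambiguous which row should supply the value in the new index.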
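To see the ambiguity concretely, the following minimal sketch rebuilds the DataFrame from the question and checks the index before reindexing. The names `index_is_unique`, `nan_label_count`, and `reindex_error` are illustrative only, and the exact wording of the error message may differ between pandas versions.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'key': [1, 2, 3, 4, 5, np.nan, np.nan],
    'value': ['one', 'two', 'three', 'four', 'five', 'six', 'seven'],
}).set_index('key')

# The NaN label appears twice, so the index is not unique.
index_is_unique = df.index.is_unique
nan_label_count = int(df.index.isna().sum())

# reindex cannot decide which of the duplicated rows to use, so it raises.
try:
    df.reindex([1, 1, 6])
    reindex_error = None
except ValueError as exc:
    reindex_error = str(exc)

print(index_is_unique, nan_label_count, reindex_error)
```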


For a general solution, drop the NaN labels from the index before reindexing:

df.loc[df.index.dropna()].reindex([1, 1, 6])

    value
key 
1   one
1   one
6   NaN
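If the duplicated labels are not necessarily NaN, one possible variant is to drop every label that occurs more than once before reindexing. This is only a sketch, and it assumes that discarding all rows with duplicated labels is acceptable; `deduplicated` and `result` are illustrative names.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'key': [1, 2, 3, 4, 5, np.nan, np.nan],
    'value': ['one', 'two', 'three', 'four', 'five', 'six', 'seven'],
}).set_index('key')

# keep=False marks every occurrence of a repeated label,
# so both NaN rows are removed here.
deduplicated = df[~df.index.duplicated(keep=False)]

# With a unique index, reindex accepts repeated and missing labels;
# the missing label 6 gets a NaN value.
result = deduplicated.reindex([1, 1, 6])
print(result)
```

Unlike `dropna`, this also handles duplicated labels that are ordinary values, at the cost of losing those rows entirely.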

If you want to keep the duplicated index labels and still use reindex, it will fail. This has actually been asked before a few times.

Subsetting a pandas DataFrame whose index contains duplicates
