Question

Suppose I have a pandas DataFrame with the columns 0, 1, and "Future Connection". How can I set columns 0 and 1 as a single tuple index?

For example, this DataFrame:

0   1        Future Connection
6   840      0.0
4   197      1.0
620 979      0.0

would become:

0           Future Connection
(6, 840)    0.0
(4, 197)    1.0
(620, 979)  0.0

Answer 1

How can I set columns 0 and 1 as a single tuple index?

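A "tuple index" does not exist as a concept in pandas. You can have an object-dtype index that contains tuples, but this is not recommended. The better option is a MultiIndex, which stores the underlying values efficiently in NumPy arrays. Indeed, pandas makes this easy via set_index: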

df = df.set_index([0, 1])

print(df)
#          Future Connection
# 0   1                     
# 6   840                0.0
# 4   197                1.0
# 620 979                0.0

print(df.index)
# MultiIndex(levels=[[4, 6, 620], [197, 840, 979]],
#            labels=[[1, 0, 2], [1, 0, 2]],
#            names=[0, 1])

print(df.index.values)
# [(6, 840) (4, 197) (620, 979)]
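
As a minimal sketch of how the MultiIndex can be used afterwards, the example below rebuilds the question's sample data, looks up one row by passing a tuple key to .loc, and, if tuple labels are really required, flattens the MultiIndex with to_flat_index() (available in pandas 0.24 and later). The variable names indexed, value, and flat are illustrative and do not come from the original answer.

```python
import pandas as pd

# Sample data reconstructed from the question
df = pd.DataFrame({
    0: [6, 4, 620],
    1: [840, 197, 979],
    "Future Connection": [0.0, 1.0, 0.0],
})

indexed = df.set_index([0, 1])

# A full key is passed as a tuple, one value per index level
value = indexed.loc[(4, 197), "Future Connection"]

# If tuple labels are truly needed, flatten the MultiIndex into an
# object-dtype Index whose elements are tuples
flat = indexed.copy()
flat.index = indexed.index.to_flat_index()
```

Keeping the MultiIndex is usually the more convenient choice, since level-based selection keeps working; the flattened form is shown only because the question asks for tuples.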

Answer 2

Use a list comprehension with DataFrame.pop to extract columns 0 and 1:

print(df.columns)
# Index([0, 1, 'Future Connection'], dtype='object')

# pop removes each column from df and returns it; zip pairs the values row by row
df.index = [x for x in zip(df.pop(0), df.pop(1))]
print(df)
#             Future Connection
# (6, 840)                  0.0
# (4, 197)                  1.0
# (620, 979)                0.0
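
A variant of the same idea, offered as a sketch rather than as part of the original answer: building the index through pd.Index with tupleize_cols=False makes it explicit that each tuple should stay a single label instead of being expanded into a MultiIndex. The sample data is reconstructed from the question.

```python
import pandas as pd

# Sample data reconstructed from the question
df = pd.DataFrame({
    0: [6, 4, 620],
    1: [840, 197, 979],
    "Future Connection": [0.0, 1.0, 0.0],
})

# Pop both columns, pair their values row by row, and keep each pair
# as one tuple label (tupleize_cols=False prevents a MultiIndex)
pairs = list(zip(df.pop(0), df.pop(1)))
df.index = pd.Index(pairs, tupleize_cols=False)
```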

Setting two columns as a tuple index in Pandas
