Question

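I want the columns to stay in the order in which they are defined in the pd.DataFrame call. In the example below, df.info() shows GroupId as the first column, and print also prints GroupId. I am using Python 3.6.3.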

import numpy as np
import pandas as pd

df = pd.DataFrame({'Id' : np.random.randint(1,100,10), 
                       'GroupId' : np.random.randint(1,5,10) })
df.info()
print(df.iloc[:,0])

Answer 1

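One way is to use collections.OrderedDict, as shown below. Note that the OrderedDict is built here from a list of (key, value) tuples, which fixes the column order.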

from collections import OrderedDict

df = pd.DataFrame(OrderedDict([('Id', np.random.randint(1,100,10)), 
                               ('GroupId', np.random.randint(1,5,10))]))

#    Id  GroupId
# 0  37        4
# 1  10        2
# 2  42        1
# 3  97        2
# 4   6        4
# 5  59        2
# 6  12        2
# 7  69        1
# 8  79        1
# 9  17        1
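As a quick check of the approach above, the following minimal sketch builds the same frame and inspects the resulting column order. The fixed seed and the `column_order` variable are assumptions added here for reproducibility and illustration; they are not part of the original answer.

```python
from collections import OrderedDict

import numpy as np
import pandas as pd

np.random.seed(0)  # assumption: seed added only so the run is reproducible

# The list of (key, value) tuples fixes the insertion order of the keys
data = OrderedDict([('Id', np.random.randint(1, 100, 10)),
                    ('GroupId', np.random.randint(1, 5, 10))])
df = pd.DataFrame(data)

# The columns come out in the insertion order of the OrderedDict
column_order = list(df.columns)
print(column_order)
```

The first column selected by `df.iloc[:, 0]` is then `Id` rather than `GroupId`.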

Answer 2

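Unless you are on Python 3.6+, where dictionaries are ordered, this is not possible with a (standard) dictionary. Even on Python 3.6+, pandas versions before 0.23 sort the dict keys when building the DataFrame, which is why GroupId comes first in the question. You will need to zip your items together and pass a list of tuples: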

np.random.seed(0)

a = np.random.randint(1, 100, 10)
b = np.random.randint(1, 5, 10)

df = pd.DataFrame(list(zip(a, b)), columns=['Id', 'GroupId'])

Alternatively,

data = [a, b]
df = pd.DataFrame(list(zip(*data)), columns=['Id', 'GroupId'])

df
   Id  GroupId
0  45        3
1  48        1
2  65        1
3  68        1
4  68        3
5  10        2
6  84        3
7  22        4
8  37        4
9  88        3
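A different option, not taken from either answer, is to keep the plain dict and state the desired order explicitly through the `columns` argument, or to reorder by label afterwards. The sketch below assumes the same two columns as the question; `df_reordered` is a name introduced here purely for illustration.

```python
import numpy as np
import pandas as pd

np.random.seed(0)

data = {'Id': np.random.randint(1, 100, 10),
        'GroupId': np.random.randint(1, 5, 10)}

# With dict input, columns= selects the keys in the listed order
df = pd.DataFrame(data, columns=['Id', 'GroupId'])

# Equivalent after the fact: reorder an existing frame by column label
df_reordered = pd.DataFrame(data)[['Id', 'GroupId']]

print(list(df.columns))
```

Listing the columns explicitly does not rely on dict ordering, so it should behave the same regardless of the Python or pandas version in use.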

Preserving column order when creating a DataFrame

2 answers: