Question

Many earlier threads have asked about rearranging columns in pandas.

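I can print my columns and copy and paste the ones I need into the new order. However, if I have more than 20 columns, can I rearrange them by column number instead? I know R can do this: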

new_df <- my_df[,c(1:9, 11:21, 10)]

I tried the same thing in pandas but got a SyntaxError:

new_df = my_df[[:, 1:9, 11:21, 10]]

I have been searching and cannot find any documentation that answers this. Can pandas do something similar in one line, the way R does?

Answer 1

We can use np.r_[]:

new_df = my_df.iloc[:, np.r_[1:9, 11:21, 10]]

Demo:

In [7]: df = pd.DataFrame(np.random.randint(5, size=(2, 21)))

In [8]: df
Out[8]:
   0   1   2   3   4   5   6   7   8   9  ...  11  12  13  14  15  16  17  18  19  20
0   2   1   2   2   1   0   4   4   4   2 ...   0   4   4   4   3   2   1   2   1   4
1   1   4   4   4   1   3   4   4   3   3 ...   1   2   3   4   2   0   1   0   2   1

[2 rows x 21 columns]

In [9]: df.iloc[:, np.r_[1:9, 11:21, 10]]
Out[9]:
   1   2   3   4   5   6   7   8   11  12  13  14  15  16  17  18  19  20  10
0   1   2   2   1   0   4   4   4   0   4   4   4   3   2   1   2   1   4   0
1   4   4   4   1   3   4   4   3   1   2   3   4   2   0   1   0   2   1   0
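
To make it clearer what np.r_ hands to iloc, here is a minimal sketch on a small made-up frame (the 2x6 frame and the name idx are only for illustration, they are not from the question). It shows that np.r_ expands the slices into one flat integer array, and that each slice is half-open, so 1:3 yields 1 and 2 but not 3:

```python
import numpy as np
import pandas as pd

# Hypothetical small frame with integer column labels 0..5
df = pd.DataFrame(np.arange(12).reshape(2, 6))

# np.r_ concatenates slices and scalars into a single integer array.
# Slices are half-open: 1:3 gives 1, 2 and 4:6 gives 4, 5.
idx = np.r_[1:3, 4:6, 3]
print(idx)  # [1 2 4 5 3]

# iloc accepts that array as a list of column positions
reordered = df.iloc[:, idx]
print(list(reordered.columns))  # [1, 2, 4, 5, 3]
```

Because the positions are 0-based and the slice end is excluded, the numbers are not a one-to-one copy of the R expression; they need to be shifted if the exact same columns are wanted.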

Answer 2

I think you need numpy.r_ to concatenate the indices:

new_df = my_df.iloc[:, np.r_[1:9, 11:21, 10]]

Sample:

np.random.seed(100)
my_df = pd.DataFrame(np.random.randint(10, size=(3,30)))

new_df = my_df.iloc[:, np.r_[1:9, 11:21, 10]]
print (new_df)
   1   2   3   4   5   6   7   8   11  12  13  14  15  16  17  18  19  20  10
0   8   3   7   7   0   4   2   5   2   1   0   8   4   0   9   6   2   4   2
1   7   0   2   9   9   3   2   5   0   7   6   2   0   8   2   5   1   8   1
2   6   3   4   7   6   3   9   0   5   7   6   6   2   4   2   7   1   6   4

new_df = my_df.iloc[:, np.r_[1:10, 11:22, 10]]
print (new_df)
   1   2   3   4   5   6   7   8   9   11 ...  13  14  15  16  17  18  19  20  \
0   8   3   7   7   0   4   2   5   2   2 ...   0   8   4   0   9   6   2   4   
1   7   0   2   9   9   3   2   5   8   0 ...   6   2   0   8   2   5   1   8   
2   6   3   4   7   6   3   9   0   4   5 ...   6   6   2   4   2   7   1   6   

   21  10  
0   1   2  
1   1   1  
2   6   4  

[3 rows x 21 columns]

Solution with range:

a = list(range(1,10)) + list(range(11,22)) + [10]
new_df = my_df.iloc[:, a]
print (new_df)
   1   2   3   4   5   6   7   8   9   11 ...  13  14  15  16  17  18  19  20  \
0   8   3   7   7   0   4   2   5   2   2 ...   0   8   4   0   9   6   2   4   
1   7   0   2   9   9   3   2   5   8   0 ...   6   2   0   8   2   5   1   8   
2   6   3   4   7   6   3   9   0   4   5 ...   6   6   2   4   2   7   1   6   

   21  10  
0   1   2  
1   1   1  
2   6   4  

[3 rows x 21 columns]
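
If the goal is to reproduce the R call c(1:9, 11:21, 10) exactly, keep in mind that R positions are 1-based and its ranges include the end, while Python positions are 0-based and ranges exclude the end. A rough sketch of the conversion follows; the helper r_style_positions and the column names c1..c21 are hypothetical, made up for this illustration:

```python
import pandas as pd

def r_style_positions(*items):
    """Turn R-style 1-based items into 0-based positions.

    Each item is either a single int or a (start, stop) tuple,
    where stop is inclusive as in R's start:stop.
    """
    positions = []
    for item in items:
        if isinstance(item, tuple):
            start, stop = item
            # 1-based inclusive start..stop -> 0-based range(start - 1, stop)
            positions.extend(range(start - 1, stop))
        else:
            positions.append(item - 1)
    return positions

# Hypothetical frame with 21 columns named c1..c21
cols = [f"c{i}" for i in range(1, 22)]
my_df = pd.DataFrame([[0] * 21, [1] * 21], columns=cols)

# Equivalent of R's my_df[, c(1:9, 11:21, 10)]
pos = r_style_positions((1, 9), (11, 21), 10)
new_df = my_df.iloc[:, pos]
print(list(new_df.columns)[-2:])  # ['c21', 'c10']
```

Written directly, the same selection would be my_df.iloc[:, np.r_[0:9, 10:21, 9]].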

Rearranging columns by column number - the Pythonic way
