Question

I have a pandas DataFrame (named dfFull) that prints the following:

Email           System 6  System 7  System 1  System 4  Count  System 5  System 3  System 2
test1@test.com         1         0         0         0      2         1         0         0
test2@test.com         0         1         0         1      3         0         1         0
test3@test.com         0         0         1         1      4         1         0         1

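The number of systems varies (it is calculated earlier in the code and stored in the variable SystemCount). I want to restructure the DataFrame so that the Email and Count columns come first, followed by all of the System columns.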

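I think a for loop would work best for this, and I have set one up below, but I don't know what to put inside it, since I want the Email and Count columns first (I'm new to Python). I also know that sort_values() might work, but I couldn't get the parameters right even with the Python documentation.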

for count in range(1, int(SystemCount)+1): #counts up to the system amount

The expected output would have the columns in this order, along with their contents:

Email Count System 1 ..... System 8

Answer 1

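You can take the difference between all columns and the leading column names, then sort the remaining names with a lambda key function (df here corresponds to dfFull in the question):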

c = ['Email','Count']
c1 = df.columns.difference(c)
cols = c +  sorted(c1, key=lambda x: int(x.split()[1]))
print (cols)
['Email', 'Count', 'System 1', 'System 2', 'System 3', 
 'System 4', 'System 5', 'System 6', 'System 7']

df = df[cols]
print (df)
            Email  Count  System 1  System 2  System 3  System 4  System 5  \
0  test1@test.com      2         0         0         0         0         1   
1  test2@test.com      3         0         0         1         1         0   
2  test3@test.com      4         1         1         0         1         1   

   System 6  System 7  
0         1         0  
1         0         1  
2         0         0
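
Since the question asked specifically about a for loop, here is a minimal sketch that builds the same column order with an explicit loop and compares it with the difference-plus-sort approach above. The sample data is copied from the question; the value SystemCount = 7 and the names dfLoop, dfSorted, lead and rest are assumptions made for illustration, and the loop assumes every system column is named "System N".

```python
import pandas as pd

# Sample data copied from the question (original, unsorted column order).
dfFull = pd.DataFrame({
    "Email": ["test1@test.com", "test2@test.com", "test3@test.com"],
    "System 6": [1, 0, 0],
    "System 7": [0, 1, 0],
    "System 1": [0, 0, 1],
    "System 4": [0, 1, 1],
    "Count": [2, 3, 4],
    "System 5": [1, 0, 1],
    "System 3": [0, 1, 0],
    "System 2": [0, 0, 1],
})

# Assumed value: the question says this is computed earlier in the code.
SystemCount = 7

# Loop version: start with the fixed columns, then append "System 1",
# "System 2", ... in numeric order.
cols = ["Email", "Count"]
for count in range(1, int(SystemCount) + 1):
    name = f"System {count}"
    if name in dfFull.columns:  # skip numbers that have no matching column
        cols.append(name)
dfLoop = dfFull[cols]

# Equivalent version without a loop, following the answer above:
# drop the fixed columns, then sort the rest by their numeric suffix.
lead = ["Email", "Count"]
rest = sorted(dfFull.columns.difference(lead), key=lambda x: int(x.split()[1]))
dfSorted = dfFull[lead + rest]

print(list(dfLoop.columns))
```

Both versions order the columns by the integer after "System" rather than by the raw string. That matters once there are ten or more systems, because a plain string sort places "System 10" before "System 2".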
