Question

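In my main df, I have a column that is concatenated from two other columns to create values like this: A1_43567_1. The first number identifies the assessment, the second number is the question ID, and the last number is the question's position on the assessment. I plan to create a pivot table with each unique value as a column, so I can see multiple students' choices for each item. But I want the pivot's columns to be ordered by question position, that is, by the third value in the concatenation. Basically this output: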

    Student ID  A1_45678_1  A1_34551_2  A1_11134_3  etc....
    12345           1            0          0      
    12346           0            0          1
    12343           1            1          0

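I tried sorting my dataframe by the original column I want it ordered by (Question Position) and then creating the pivot table, but that doesn't produce the result shown above. Is there a way to sort the original concatenated values by the third value in the column? Or is it possible to sort the pivot table's columns by the third value in each column name?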

The current code is:

   demo_pivot.sort(['Question Position'], ascending=True)

   demo_pivot['newcol'] = 'A' + str(interim_selection) + '_' + \
   demo_pivot['Item ID'].map(str) + "_" + demo_pivot['Question Position'].map(str)

   demo_pivot= pd.pivot_table(demo_pivot, index='Student ANET ID',values='Points Received',\
   columns='newcol').reset_index()

But it generates this output:

    Student ID  A1_45678_1  A1_34871_7  A1_11134_15  etc....
    12345           1            0          0      
    12346           0            0          1
    12343           1            1          0

Answer 1

The call to pd.pivot_table() returns a DataFrame, correct? If so, can you just reorder the columns of the resulting DataFrame? Something like:

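    def sort_columns(column_list):
        # Columns without a position suffix (e.g. the student ID added by
        # reset_index()) cannot be parsed, so keep them first, unsorted
        other_cols = [col for col in column_list if col.count('_') < 2]

        # Create a list of tuples: (question position, column name)
        sort_list = [(int(col.split('_')[2]), col)
                     for col in column_list if col.count('_') >= 2]

        # Sorts by the first item in each tuple, which is the question position
        sort_list.sort()

        # Return the column names in the sorted order:
        return other_cols + [x[1] for x in sort_list]

    # Now, you should be able to reorder the DataFrame like so:
    demo_pivot = demo_pivot.loc[:, sort_columns(demo_pivot.columns)]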
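For reference, here is a minimal end-to-end sketch of the same idea. The sample rows and the value of `interim_selection` are made up for illustration (they are not from the question); only the column names are taken from it. As far as I know, `pivot_table` returns its column labels in sorted string order, which would explain why sorting the rows beforehand has no effect. The sketch reorders the columns by their trailing integer before calling `reset_index()`, so the ID column never needs parsing.

```python
import pandas as pd

# Hypothetical long-format data: one row per student per question
records = pd.DataFrame({
    "Student ANET ID": [12345, 12345, 12345, 12346, 12346, 12346],
    "Item ID": [45678, 34871, 11134, 45678, 34871, 11134],
    "Question Position": [1, 7, 15, 1, 7, 15],
    "Points Received": [1, 0, 0, 0, 0, 1],
})
interim_selection = 1  # assumed assessment number

# Build the concatenated label, e.g. "A1_45678_1"
records["newcol"] = (
    "A" + str(interim_selection) + "_"
    + records["Item ID"].astype(str) + "_"
    + records["Question Position"].astype(str)
)

pivot = pd.pivot_table(
    records,
    index="Student ANET ID",
    values="Points Received",
    columns="newcol",
)

# Reorder the columns by the integer after the last underscore
ordered = sorted(pivot.columns, key=lambda col: int(col.rsplit("_", 1)[1]))
pivot = pivot[ordered].reset_index()

print(list(pivot.columns))
```

Using `sorted` with a `key` avoids building the intermediate list of tuples, and converting the suffix with `int` keeps position 15 after position 7 rather than before it.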
