Question

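I have tried several ways to read in this Excel file and reshape it with pandas. I have tried different functions such as merge(), pivot(), melt(), and reset_index(), but I still can't figure it out. Can anyone point me in the right direction? Here is the current table: (image: current)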

Here is the desired output: (image: desired output)

Sorry about the formatting. I'm new to Stack Overflow, but I have done my research and can't seem to work out the answer.

I tried a lot of code that I have since deleted because it didn't work, but here are some examples of what I attempted.

    import pandas as pd
    df = pd.read_excel(file)
    df.iloc[0:,0].fillna(method= 'ffill', inplace = True)
    new_cols = df.columns[2:]
    df = df.rename(columns = {"Unnamed: 1":"to col"})

end_file_cols is a list containing the columns shown in the "desired" image.

    df = df.reindex(columns = end_file_cols)
    df['Demo'] = df.index.tolist()
    df.pivot(index = 'Media', columns = new_cols.tolist())

This is what I get when I print df:

    import pandas as pd
    df = pd.read_excel(file)
    df.iloc[0:,0].fillna(method= 'ffill', inplace = True)
    new_cols = df.columns[2:]
    df = df.rename(columns = {"Unnamed: 1":"to col"})
    print(df)

        Media      to col  Age Group 1  Age Group 2  Age Group 3  Age Group 4
    0  Plan 1  Total Cost           65            4           90           88
    1  Plan 1    Net Loss           88           77           85           85
    2  Plan 1       Views           60           97           76           82
    3  Plan 2  Total Cost           96           92            5            0
    4  Plan 2    Net Loss           89           77           51           59
    5  Plan 2      Budget           42           67           49           96
    6  Plan 3  Total Cost           22           78          100           10
    7  Plan 3    Net Prof           59           33           72           87

Answer 1

You can use stack and unstack to move MultiIndex levels between the rows and the columns. Simply do:

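    import pandas as pd

    # Read the first two columns as a two-level row index
    df = pd.read_excel('data.xlsx', index_col=[0,1])
    # Move the inner index level to the columns, then move the original column labels into the rows
    new_df = df.unstack().stack(level=0)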
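To make the reshape easier to follow without the Excel file, here is a minimal sketch that rebuilds the frame printed in the question in memory. It assumes that `set_index` on the first two columns is a fair stand-in for `index_col=[0,1]`; the names `records` and `age_cols` are only illustrative.

```python
import pandas as pd

# Sample rows copied from the DataFrame printed in the question.
records = [
    ("Plan 1", "Total Cost", 65, 4, 90, 88),
    ("Plan 1", "Net Loss", 88, 77, 85, 85),
    ("Plan 1", "Views", 60, 97, 76, 82),
    ("Plan 2", "Total Cost", 96, 92, 5, 0),
    ("Plan 2", "Net Loss", 89, 77, 51, 59),
    ("Plan 2", "Budget", 42, 67, 49, 96),
    ("Plan 3", "Total Cost", 22, 78, 100, 10),
    ("Plan 3", "Net Prof", 59, 33, 72, 87),
]
age_cols = ["Age Group 1", "Age Group 2", "Age Group 3", "Age Group 4"]
df = pd.DataFrame(records, columns=["Media", "to col"] + age_cols)

# Assumed equivalent of index_col=[0, 1]: the first two columns
# become a two-level row index.
df = df.set_index(["Media", "to col"])

# unstack() moves the innermost index level ("to col") into the columns,
# giving (age group, measure) column pairs. stack(level=0) then moves the
# age-group level from the columns into the row index.
new_df = df.unstack().stack(level=0)
print(new_df)
```

The result has one row per (plan, age group) pair and one column per measure. Measures that a plan does not have, such as Budget for Plan 1, come out as NaN.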

Then just rename the index levels:

    new_df.index.rename(('Media','Demo'), inplace=True)

Missing values will be NaN (np.nan); you can optionally replace them with any value you like using new_df.fillna(<value>).
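The renaming and filling steps can be sketched on a small stand-in frame. Everything here is hypothetical: the data is a cut-down imitation of `new_df`, the fill value `0` is an arbitrary choice, and the final `reset_index()` assumes the desired output is a flat table with Media and Demo as ordinary columns, which the question's image would have to confirm.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for new_df: a two-level row index whose second
# level has no name yet, plus a column with missing values.
index = pd.MultiIndex.from_product(
    [["Plan 1", "Plan 2"], ["Age Group 1", "Age Group 2"]],
    names=["Media", None],
)
new_df = pd.DataFrame(
    {"Total Cost": [65, 4, 96, 92], "Budget": [np.nan, np.nan, 42, 67]},
    index=index,
)

# Name both index levels; assigning to .names is an alternative to
# calling index.rename(..., inplace=True).
new_df.index.names = ["Media", "Demo"]

# fillna returns a new DataFrame, so new_df itself keeps its NaN values.
filled = new_df.fillna(0)

# Optional: turn the index levels into regular columns for a flat table.
flat = filled.reset_index()
print(flat)
```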
