Question

I am trying to transpose rows and columns, but I can't figure out how to do it.

Here is a sheet with the original data and the desired output: https://drive.google.com/file/d/1HLBSCdziga3gJtCkNEpx-paFO9eHHYJy/view?usp=sharing

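Keep the "Date" column as it is, and turn the "Source:Email", "Source:Social" and "Source:Display" columns into rows.
The Website Action rows (Scroll, Click, Swipe) then need to become columns.
In addition, I would like to create a new column that adds up all the actions.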

Is this something I can do with Python (pandas)? I would really appreciate any help with this.

Thank you so much!

Answer 1

You can do it like this:

import pandas as pd

df = pd.DataFrame({"Date": ["10/1/2018", "10/2/2018", "10/2/2018", "10/3/2018"],
                   "Website Action": ["Scroll", "Scroll", "Click", "Swipe"],
                   "Source:Email": [1, 1, 3, 2],
                   "Source:Social": [4, 2, 10, 6],
                   "Source:Display": [5, 3, 3, 9]})

This just sets up the example frame (the column order comes out slightly different):

        Date  Source:Display  Source:Email  Source:Social Website Action
0  10/1/2018               5             1              4         Scroll
1  10/2/2018               3             1              2         Scroll
2  10/2/2018               3             3             10          Click
3  10/3/2018               9             2              6          Swipe

You can now combine "melt" and "pivot_table" to get what you need:

df.melt(id_vars=["Date","Website Action"]).pivot_table(index = ["Date","variable"], columns = "Website Action").fillna(0)

This produces:

                         value
Website Action           Click Scroll Swipe
Date      variable
10/1/2018 Source:Display   0.0    5.0   0.0
          Source:Email     0.0    1.0   0.0
          Source:Social    0.0    4.0   0.0
10/2/2018 Source:Display   3.0    3.0   0.0
          Source:Email     3.0    1.0   0.0
          Source:Social   10.0    2.0   0.0
10/3/2018 Source:Display   0.0    0.0   9.0
          Source:Email     0.0    0.0   2.0
          Source:Social    0.0    0.0   6.0

Reordering and renaming is left up to you :-)
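For the reordering, renaming and the total column the question asks for, one possible follow-up is sketched below. It is only a sketch: the names "Source", "Count" and "Total Engagements" are labels picked for illustration, and aggfunc="sum" is an assumption (the one-liner above uses the default, which is the mean; with one value per Date/action/source combination both give the same result).

```python
import pandas as pd

df = pd.DataFrame({
    "Date": ["10/1/2018", "10/2/2018", "10/2/2018", "10/3/2018"],
    "Website Action": ["Scroll", "Scroll", "Click", "Swipe"],
    "Source:Email": [1, 1, 3, 2],
    "Source:Social": [4, 2, 10, 6],
    "Source:Display": [5, 3, 3, 9],
})

wide = (
    # Long format: one row per (Date, Website Action, Source) with its count.
    df.melt(id_vars=["Date", "Website Action"],
            var_name="Source", value_name="Count")
      # Actions become columns; naming `values` keeps the columns single-level.
      .pivot_table(index=["Date", "Source"], columns="Website Action",
                   values="Count", aggfunc="sum", fill_value=0)
      .reset_index()
)

wide.columns.name = None  # drop the leftover "Website Action" axis label
# Optional cosmetic step: strip the "Source:" prefix from the source names.
wide["Source"] = wide["Source"].str.replace("Source:", "", regex=False)
# New column that adds up all actions per row.
wide["Total Engagements"] = wide[["Click", "Scroll", "Swipe"]].sum(axis=1)

print(wide)
```

Passing values="Count" avoids the extra "value" level in the column header, so no flattening is needed afterwards.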

Answer 2

My solution may be a bit cumbersome:

import pandas as pd
df = pd.read_table('A/tab/separated/file/with/your/data.tsv')

#Stack your columns as an extra index, then unstack one of your indexes into columns
reshaped_df = df.set_index(['Date', 'Website Action']).stack().unstack(level=1).fillna(0).reset_index()

# Rename the columns, calculate total engagements
reshaped_df.columns = ['Date','Source','Click','Scroll','Swipe']
reshaped_df['Total Engagements'] = reshaped_df[['Click','Scroll','Swipe']].sum(axis=1)

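Long story short: yes, pandas can do this, and the above is one example. I recommend running everything in an interactive shell (check what happens when you apply set_index, and what happens when you stack or unstack).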
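To follow that advice without the original file, the intermediate steps can be inspected one at a time. The sketch below assumes the small example frame from Answer 1 in place of the TSV file; the variable names indexed, stacked and unstacked are only illustrative.

```python
import pandas as pd

# Stand-in for the real file: the example frame from Answer 1.
df = pd.DataFrame({
    "Date": ["10/1/2018", "10/2/2018", "10/2/2018", "10/3/2018"],
    "Website Action": ["Scroll", "Scroll", "Click", "Swipe"],
    "Source:Email": [1, 1, 3, 2],
    "Source:Social": [4, 2, 10, 6],
    "Source:Display": [5, 3, 3, 9],
})

# Step 1: Date and Website Action move into a two-level row index;
# only the three source columns remain as data.
indexed = df.set_index(["Date", "Website Action"])

# Step 2: the source columns are stacked into a third index level,
# giving a Series with one value per (Date, action, source).
stacked = indexed.stack()

# Step 3: index level 1 (Website Action) is pivoted out into columns;
# combinations that never occurred show up as NaN until fillna(0).
unstacked = stacked.unstack(level=1)

print(indexed, stacked, unstacked, sep="\n\n")
```

Printing each intermediate makes it easier to see why fillna(0) and reset_index() are needed at the end of the chain.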
