My DataFrame looks like this:
Region Date Drip Coffee Espresso Latte Other
Central 1 5 1 2 3
East 1 3 3 1 4
North 1 5 1 3 2
Central 2 2 7 2 0
East 2 10 3 2 1
North 2 6 9 4 2
.
.
.
I want to pivot the Drip Coffee, Espresso, Latte and Other columns so that they repeat for each Date and Region, arranged like this:
Region Date Type Value
Central 1 Drip Coffee 5
East 1 Drip Coffee 3
North 1 Drip Coffee 5
Central 1 Espresso 1
East 1 Espresso 3
North 1 Espresso 1
.
.
.
Central 2 Drip Coffee 2
East 2 Drip Coffee 10
North 2 Drip Coffee 6
.
.
I have tried a few things, for example:
df_new = df_old.pivot(index='Date',columns=['Drip Coffee', 'Espresso', 'Latte', 'Other']).stack(0).rename_axis(['Date','Type']).reset_index()
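But this gives me ValueError: all arrays must be same length
I know I am missing a new column for Value, but that is because I don't know how to pivot a series of values like this.
I'd like to know whether there is a possible workaround, since this problem seems fairly unusual. I couldn't find a solution for this kind of multiple repetition anywhere.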
Answer 0 (score: 1)
Setup
d = {'id_vars': ['Region', 'Date'], 'var_name': 'Type', 'value_name': 'Value'}
IIUC, use melt and sort_values.
df.melt(**d).sort_values(by=['Date', 'Type'])
Region Date Type Value
0 Central 1 Drip Coffee 5
1 East 1 Drip Coffee 3
2 North 1 Drip Coffee 5
6 Central 1 Espresso 1
7 East 1 Espresso 3
8 North 1 Espresso 1
12 Central 1 Latte 2
13 East 1 Latte 1
14 North 1 Latte 3
18 Central 1 Other 3
19 East 1 Other 4
20 North 1 Other 2
3 Central 2 Drip Coffee 2
4 East 2 Drip Coffee 10
5 North 2 Drip Coffee 6
9 Central 2 Espresso 7
10 East 2 Espresso 3
11 North 2 Espresso 9
15 Central 2 Latte 2
16 East 2 Latte 2
17 North 2 Latte 4
21 Central 2 Other 0
22 East 2 Other 1
23 North 2 Other 2
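For anyone who wants to reproduce this end to end, here is a minimal self-contained sketch. It assumes the DataFrame holds only the six rows shown in the question (the elided rows are not filled in), and it passes the melt arguments as explicit keywords instead of unpacking the dictionary d. Adding Region as a final sort key and calling reset_index(drop=True) are my own additions, not part of the answer above: the first makes the row order fully deterministic, the second replaces the shuffled index labels with 0..23.

```python
import pandas as pd

# Sample data copied from the six rows shown in the question.
df = pd.DataFrame({
    "Region": ["Central", "East", "North", "Central", "East", "North"],
    "Date": [1, 1, 1, 2, 2, 2],
    "Drip Coffee": [5, 3, 5, 2, 10, 6],
    "Espresso": [1, 3, 1, 7, 3, 9],
    "Latte": [2, 1, 3, 2, 2, 4],
    "Other": [3, 4, 2, 0, 1, 2],
})

long_df = (
    # Keep Region and Date as identifiers; every other column becomes
    # a (Type, Value) pair, one row per original cell.
    df.melt(id_vars=["Region", "Date"], var_name="Type", value_name="Value")
      # Group rows by Date, then drink Type; Region is an added tie-breaker.
      .sort_values(by=["Date", "Type", "Region"])
      # Optional: discard the original index labels left over from melt.
      .reset_index(drop=True)
)

print(long_df)
```

With 6 input rows and 4 drink columns, the result has 24 rows and the four columns Region, Date, Type and Value.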