import pandas as pd
from datetime import datetime

df = pd.DataFrame({
    'origin': ['japan', 'japan', 'japan', 'japan'],
    'pastime': ['baseball', 'sumo', 'keirin', 'football'],
    datetime(2000, 1, 1): [4, 5, 4, 5],
    datetime(2005, 1, 1): [4, 3, 2, 1],
    datetime(2010, 1, 1): [4, 2, 2, 1],
})
My dataframe has many columns labelled with dates:
Index(['origin','pastime', 2000-01-01 00:00:00,
2005-01-01 00:00:00, 2010-01-01 00:00:00],
dtype='object')
I would like to reshape the dataframe so that it has the columns: origin, pastime, date, value
The first row would be:
origin = japan
pastime = baseball
date = 2000-01-01
value = 4
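I have looked at examples that use stack to push columns down into the rows as an index level, but in my case it pushes the 'origin' and 'pastime' columns down as well.
How would I go about this transformation?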
Answer 0 (score: 4)
I think you are looking for melt:
df.melt(['origin', 'pastime'], var_name='date')
origin pastime date value
0 japan baseball 2005-01-01 4
1 japan sumo 2005-01-01 3
2 japan keirin 2005-01-01 2
3 japan football 2005-01-01 1
4 japan baseball 2010-01-01 4
5 japan sumo 2010-01-01 2
6 japan keirin 2010-01-01 2
7 japan football 2010-01-01 1
8 japan baseball 2000-01-01 4
9 japan sumo 2000-01-01 5
10 japan keirin 2000-01-01 4
11 japan football 2000-01-01 5
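As a minimal sketch of the same melt call with the arguments spelled out: the name long_df, the explicit value_name, and the final sort by date are my additions for illustration, not part of the answer above. The sort is only there so that the 2000-01-01 rows come first regardless of how the columns happened to be ordered.

```python
import pandas as pd
from datetime import datetime

df = pd.DataFrame({
    'origin': ['japan', 'japan', 'japan', 'japan'],
    'pastime': ['baseball', 'sumo', 'keirin', 'football'],
    datetime(2000, 1, 1): [4, 5, 4, 5],
    datetime(2005, 1, 1): [4, 3, 2, 1],
    datetime(2010, 1, 1): [4, 2, 2, 1],
})

# Keep 'origin' and 'pastime' as identifier columns; every other column
# (the date-labelled ones) is unpivoted into a date/value pair.
long_df = df.melt(id_vars=['origin', 'pastime'],
                  var_name='date',
                  value_name='value')

# Make sure the date column is a proper datetime dtype, then order by date.
# A stable sort keeps the original pastime order within each date.
long_df['date'] = pd.to_datetime(long_df['date'])
long_df = long_df.sort_values('date', kind='stable').reset_index(drop=True)

print(long_df)
```

The first row of long_df then corresponds to the row asked for in the question (japan, baseball, 2000-01-01, 4).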
Answer 1 (score: 3)
Use set_index and stack:
df.set_index(['origin','pastime']).stack().reset_index()
Out[150]:
origin pastime level_2 0
0 japan baseball 2010-01-01 4
1 japan baseball 2000-01-01 4
2 japan baseball 2005-01-01 4
3 japan sumo 2010-01-01 2
4 japan sumo 2000-01-01 5
5 japan sumo 2005-01-01 3
6 japan keirin 2010-01-01 2
7 japan keirin 2000-01-01 4
8 japan keirin 2005-01-01 2
9 japan football 2010-01-01 1
10 japan football 2000-01-01 5
11 japan football 2005-01-01 1
PS. You can use rename to change the column names.
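A possible way to put the rename step together with set_index and stack, as a sketch only: it assumes the default column names produced by reset_index are 'level_2' and 0, as in the output above, and the variable name stacked is just for illustration.

```python
import pandas as pd
from datetime import datetime

df = pd.DataFrame({
    'origin': ['japan', 'japan', 'japan', 'japan'],
    'pastime': ['baseball', 'sumo', 'keirin', 'football'],
    datetime(2000, 1, 1): [4, 5, 4, 5],
    datetime(2005, 1, 1): [4, 3, 2, 1],
    datetime(2010, 1, 1): [4, 2, 2, 1],
})

stacked = (
    df.set_index(['origin', 'pastime'])  # protect these columns from stack
      .stack()                           # date columns become an index level
      .reset_index()                     # back to ordinary columns
      # Assumed default names: 'level_2' for the unnamed level, 0 for values.
      .rename(columns={'level_2': 'date', 0: 'value'})
)

print(stacked)
```

Unlike melt, this keeps the rows grouped by pastime rather than by date.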