Reshape only some columns into a single column

Time: 2017-11-17 04:16:53

Tags: python pandas

import pandas as pd
from datetime import datetime

df = pd.DataFrame({'origin': ['japan', 'japan', 'japan', 'japan'],
                   'pastime': ['baseball', 'sumo', 'keirin', 'football'],
                   datetime(2000, 1, 1): [4, 5, 4, 5],
                   datetime(2005, 1, 1): [4, 3, 2, 1],
                   datetime(2010, 1, 1): [4, 2, 2, 1]
                   })

My DataFrame has many columns labelled with dates:

Index(['origin','pastime', 2000-01-01 00:00:00,
       2005-01-01 00:00:00, 2010-01-01 00:00:00],
      dtype='object')

I want to reshape the DataFrame so that it has the columns: origin, pastime, date, value

The first row of the result would be:

origin = japan
pastime = baseball
date = 2000-01-01
value = 4

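I have seen examples that use stack to push columns down into the rows as an index level, but in my case it pushes the 'origin' and 'pastime' columns down as well.

How would I go about this transformation?

2 answers: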

Answer 0: (score: 4)

I think you are looking for melt:

df.melt(['origin', 'pastime'], var_name='date')

   origin   pastime       date  value
0   japan  baseball 2005-01-01      4
1   japan      sumo 2005-01-01      3
2   japan    keirin 2005-01-01      2
3   japan  football 2005-01-01      1
4   japan  baseball 2010-01-01      4
5   japan      sumo 2010-01-01      2
6   japan    keirin 2010-01-01      2
7   japan  football 2010-01-01      1
8   japan  baseball 2000-01-01      4
9   japan      sumo 2000-01-01      5
10  japan    keirin 2000-01-01      4
11  japan  football 2000-01-01      5
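The row order above simply follows the column order of the frame. As a minimal sketch, assuming the goal is the exact layout from the question (columns origin, pastime, date, value, with the earliest date first), the same melt call can be combined with value_name and a stable sort. The name long_df is only illustrative, and the pd.to_datetime call is a precaution in case the melted labels come back as plain Python datetime objects.

```python
import pandas as pd
from datetime import datetime

df = pd.DataFrame({
    'origin': ['japan', 'japan', 'japan', 'japan'],
    'pastime': ['baseball', 'sumo', 'keirin', 'football'],
    datetime(2000, 1, 1): [4, 5, 4, 5],
    datetime(2005, 1, 1): [4, 3, 2, 1],
    datetime(2010, 1, 1): [4, 2, 2, 1],
})

# Keep 'origin' and 'pastime' as identifier columns; every other column
# (the date-labelled ones) is unpivoted into 'date' / 'value' pairs.
long_df = df.melt(id_vars=['origin', 'pastime'],
                  var_name='date', value_name='value')

# Precaution: make sure the former column labels are a datetime column.
long_df['date'] = pd.to_datetime(long_df['date'])

# Stable sort by date so rows keep their original pastime order within a date.
long_df = long_df.sort_values('date', kind='mergesort').reset_index(drop=True)

print(long_df.head())
```

With this ordering the first row corresponds to the one described in the question (japan, baseball, 2000-01-01, 4).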

Answer 1: (score: 3)

Use set_index and stack:

df.set_index(['origin','pastime']).stack().reset_index()
Out[150]: 
   origin   pastime    level_2  0
0   japan  baseball 2010-01-01  4
1   japan  baseball 2000-01-01  4
2   japan  baseball 2005-01-01  4
3   japan      sumo 2010-01-01  2
4   japan      sumo 2000-01-01  5
5   japan      sumo 2005-01-01  3
6   japan    keirin 2010-01-01  2
7   japan    keirin 2000-01-01  4
8   japan    keirin 2005-01-01  2
9   japan  football 2010-01-01  1
10  japan  football 2000-01-01  5
11  japan  football 2005-01-01  1

P.S. You can use rename to change the column names.
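A minimal sketch of that rename step, assuming the default labels shown in the output above ('level_2' for the unnamed stacked level and 0 for the values); the name stacked is only illustrative:

```python
import pandas as pd
from datetime import datetime

df = pd.DataFrame({
    'origin': ['japan', 'japan', 'japan', 'japan'],
    'pastime': ['baseball', 'sumo', 'keirin', 'football'],
    datetime(2000, 1, 1): [4, 5, 4, 5],
    datetime(2005, 1, 1): [4, 3, 2, 1],
    datetime(2010, 1, 1): [4, 2, 2, 1],
})

# Move the identifier columns into the index so that stack() only
# pushes the date-labelled columns down into the rows.
stacked = df.set_index(['origin', 'pastime']).stack().reset_index()

# reset_index() names the unnamed stacked level 'level_2' and the
# value column 0; rename both to the labels asked for in the question.
stacked = stacked.rename(columns={'level_2': 'date', 0: 'value'})

print(stacked.head())
```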