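Add an index level to a DataFrame

Time: 2014-03-22 00:24:29

Tags: python pandas

I have a DataFrame with a datetime index, as shown below. I would like to add a first index level (see "Target" below) so that every date is crossed with each value in First_column.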

First_column = ['s0000', 's0001', 's0002', 's0003', 's0004', ...]

Does anyone know how to proceed?

Thank you very much. Alex

My DataFrame:

<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 17544 entries, 2015-01-01 00:00:00 to 2016-12-31 23:00:00
Data columns (total 12 columns):

Target:

<class 'pandas.core.frame.DataFrame'>
MultiIndex: 996000 entries, (s0000, 2015-01-01 00:00:00) to (s0999, 2012-12-31 00:00:00)
Data columns (total 8 columns):

Scenario Date

s0000    2015-02-28
         2015-03-03 
         2015-03-04
         2015-03-05
         2015-03-06
         2015-03-07
         2015-03-10
         2015-03-11
         2015-03-12
         2015-03-13
s0001    2015-02-28
         2015-03-03 
         2015-03-04
         2015-03-05
         2015-03-06
         2015-03-07
         2015-03-10
         2015-03-11
         2015-03-12
         2015-03-13
s0002    2015-02-28
         2015-03-03 
         2015-03-04
         2015-03-05
         2015-03-06
         2015-03-07
         2015-03-10
         2015-03-11
         2015-03-12
         2015-03-13
s0003    ...

2 answers:

Answer 0 (score: 1)

You can use pd.concat with the keys parameter:

import pandas as pd
df = pd.DataFrame(range(10), index=pd.date_range('2015-2-27', freq='B', periods=10))
#             0
# 2015-02-27  0
# 2015-03-02  1
# 2015-03-03  2
# 2015-03-04  3
# 2015-03-05  4
# 2015-03-06  5
# 2015-03-09  6
# 2015-03-10  7
# 2015-03-11  8
# 2015-03-12  9
first_col = ['s{:04d}'.format(i) for i in range(1,5)]
# ['s0001', 's0002', 's0003', 's0004']

newdf = pd.concat([df]*len(first_col), keys=first_col)
print(newdf)

yields

                  0
s0001 2015-02-27  0
      2015-03-02  1
      2015-03-03  2
      2015-03-04  3
      2015-03-05  4
      2015-03-06  5
      2015-03-09  6
      2015-03-10  7
      2015-03-11  8
      2015-03-12  9
s0002 2015-02-27  0
      2015-03-02  1
      2015-03-03  2
      2015-03-04  3
      2015-03-05  4
      2015-03-06  5
      2015-03-09  6
      2015-03-10  7
      2015-03-11  8
      2015-03-12  9
s0003 2015-02-27  0
      2015-03-02  1
      2015-03-03  2
      2015-03-04  3
      2015-03-05  4
      2015-03-06  5
      2015-03-09  6
      2015-03-10  7
      2015-03-11  8
      2015-03-12  9
s0004 2015-02-27  0
      2015-03-02  1
      2015-03-03  2
      2015-03-04  3
      2015-03-05  4
      2015-03-06  5
      2015-03-09  6
      2015-03-10  7
      2015-03-11  8
      2015-03-12  9

Happily, I just learned this yesterday from Joris.
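As a small extension of the same idea, pd.concat also accepts a names argument, which labels the levels of the resulting MultiIndex and makes later selection easier to read. The sketch below is only an illustration: the level names "scenario" and "date", the column name "value", and the three labels are assumptions, not part of the question.

```python
import pandas as pd

# Small stand-in for the asker's frame: 10 business days, one column.
df = pd.DataFrame({"value": range(10)},
                  index=pd.date_range("2015-02-27", freq="B", periods=10))

# Hypothetical labels following the s0000, s0001, ... pattern from the question.
scenarios = ["s{:04d}".format(i) for i in range(3)]

# keys= becomes the outer index level; names= labels both levels.
stacked = pd.concat([df] * len(scenarios), keys=scenarios,
                    names=["scenario", "date"])

# Select every row belonging to one label of the outer level.
one_scenario = stacked.loc["s0001"]
print(stacked.index.names)
print(one_scenario.head(3))
```

With named levels, the dates can also be addressed by name, for example stacked.xs("2015-03-02", level="date").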

Answer 1 (score: 0)

You could do something like this...

import pandas as pd

first_col = ['s0001', 's0002', 's0003', 's0004']

# Make your datetime index
dt_index = pd.date_range('2015-2-27', freq='B', periods=10)

# Outer level: each label repeated once per date, grouped together
first_col_index = sorted(len(dt_index) * first_col)

# Inner level: the full date range cycled once per label
dt_level = dt_index.tolist() * len(first_col)

# Make a DataFrame with a hierarchical index; both levels have
# len(first_col) * len(dt_index) entries
df = pd.DataFrame(range(len(first_col) * len(dt_index)),
                  index=[first_col_index, dt_level])
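Another way to build this kind of two-level index, offered here as a sketch rather than as part of the answer, is pd.MultiIndex.from_product, which pairs every label with every date without any manual repeating or sorting. The level names "scenario" and "date" and the column name "value" are assumptions chosen for illustration.

```python
import pandas as pd

first_col = ["s0001", "s0002", "s0003", "s0004"]
dt_index = pd.date_range("2015-02-27", freq="B", periods=10)

# Cartesian product: every label paired with every date,
# with the label varying slowest.
index = pd.MultiIndex.from_product([first_col, dt_index],
                                   names=["scenario", "date"])

# One row per (label, date) pair, filled with placeholder values.
df = pd.DataFrame({"value": range(len(index))}, index=index)

# Each (label, date) pair should appear exactly once.
assert df.index.is_unique
print(df.head(3))
```

Checking df.index.is_unique is a cheap way to confirm that the two levels were lined up as intended.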