How can I efficiently rename/shorten many columns in pandas?

Asked: 2016-05-05 07:52:46

Tags: python pandas

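I am looking for an efficient way to shorten/rename the column names in my DataFrame. I have a lot of columns, so doing it by hand is not an option. My data looks like this: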

df.head(2):
            (spyhr, 10:00:00)  (spyhr, 11:00:00)  (spyhr, 12:00:00)  
date                                                                  
2009-02-20           0.004808          -0.003776          -0.004145   
2009-02-23          -0.007931          -0.015349          -0.002748

df.columns.values:

array([('spyhr', datetime.time(10, 0)), ('spyhr', datetime.time(11, 0)),
       ('spyhr', datetime.time(12, 0))], dtype=object)

I would like my column names to look like this:

                     spyhr10             spyhr11           spyhr12
date                                                                  
2009-02-20           0.004808          -0.003776          -0.004145   
2009-02-23          -0.007931          -0.015349          -0.002748

3 answers:

Answer 0 (score: 3)

import re

# Assumes each column label is a string such as '(spyhr, 10:00:00)'.
pattern = re.compile(r'^\((\w+), (\d+):.*')
df.columns = df.columns.map(lambda name: pattern.sub(r'\1\2', name))
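The question's `df.columns.values` output shows the labels as `(name, datetime.time)` tuples rather than strings, in which case a regex substitution would not apply directly. A minimal sketch for that case, assuming every label is a two-element tuple whose second element is a `datetime.time` (the sample frame below is a hypothetical reconstruction of the question's data):

```python
import datetime

import pandas as pd

# Hypothetical frame mirroring the question: each column label is a
# (name, datetime.time) tuple.
columns = [("spyhr", datetime.time(h, 0)) for h in (10, 11, 12)]
df = pd.DataFrame(
    [[0.004808, -0.003776, -0.004145], [-0.007931, -0.015349, -0.002748]],
    index=pd.Index(["2009-02-20", "2009-02-23"], name="date"),
    columns=pd.MultiIndex.from_tuples(columns),
)

# Unpack each tuple and join the name with the zero-padded hour.
df.columns = [f"{name}{t.strftime('%H')}" for name, t in df.columns]
print(list(df.columns))  # ['spyhr10', 'spyhr11', 'spyhr12']
```

Unpacking the tuple avoids depending on how pandas happens to print the labels, which is what the string-based approaches rely on.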

Answer 1 (score: 1)

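I tried modifying your previous question.

You need to convert the Time column to hour strings with strftime, and then flatten the column names with a list comprehension and join.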

import pandas as pd

print(df)
                 Time     spyhr         b         c
0 2009-02-20 11:00:00 -0.003776 -0.001606 -0.000150
1 2009-02-20 12:00:00 -0.004145  0.007597 -0.000054
2 2009-02-20 13:00:00 -0.007896  0.017419 -0.000241
3 2009-02-23 11:00:00 -0.015349  0.010237 -0.000328
4 2009-02-23 12:00:00 -0.002748  0.004150 -0.000070
5 2009-02-23 13:00:00 -0.007760  0.011192 -0.000270

df['time'] = df['Time'].dt.strftime('%H')
df['date'] = df['Time'].dt.date

print(type(df.at[0, 'time']))
<class 'str'>

print(df)
                 Time     spyhr         b         c time        date
0 2009-02-20 11:00:00 -0.003776 -0.001606 -0.000150   11  2009-02-20
1 2009-02-20 12:00:00 -0.004145  0.007597 -0.000054   12  2009-02-20
2 2009-02-20 13:00:00 -0.007896  0.017419 -0.000241   13  2009-02-20
3 2009-02-23 11:00:00 -0.015349  0.010237 -0.000328   11  2009-02-23
4 2009-02-23 12:00:00 -0.002748  0.004150 -0.000070   12  2009-02-23
5 2009-02-23 13:00:00 -0.007760  0.011192 -0.000270   13  2009-02-23
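# Restrict values to the numeric columns so that Time is not aggregated.
df = pd.pivot_table(df, index='date', columns='time', values=['spyhr', 'b', 'c'])
# list comprehension: join each (name, hour) tuple into one string
df.columns = [''.join(col) for col in df.columns]
print(df)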
             spyhr11   spyhr12   spyhr13       b11       b12       b13  \
date                                                                     
2009-02-20 -0.003776 -0.004145 -0.007896 -0.001606  0.007597  0.017419   
2009-02-23 -0.015349 -0.002748 -0.007760  0.010237  0.004150  0.011192   

                 c11       c12       c13  
date                                      
2009-02-20 -0.000150 -0.000054 -0.000241  
2009-02-23 -0.000328 -0.000070 -0.000270  
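The flattening step can also be written with `Index.map` instead of a list comprehension. The following is only a sketch on a hypothetical, cut-down version of the data above (two value columns, two hours), and it assumes both levels of the pivoted columns are strings, as they are after the strftime conversion:

```python
import pandas as pd

# Hypothetical long-format data, shaped like the frame above.
df = pd.DataFrame({
    "Time": pd.to_datetime(["2009-02-20 11:00:00", "2009-02-20 12:00:00",
                            "2009-02-23 11:00:00", "2009-02-23 12:00:00"]),
    "spyhr": [-0.003776, -0.004145, -0.015349, -0.002748],
    "b": [-0.001606, 0.007597, 0.010237, 0.004150],
})
df["time"] = df["Time"].dt.strftime("%H")  # hour as a string, e.g. '11'
df["date"] = df["Time"].dt.date

wide = df.pivot_table(index="date", columns="time", values=["spyhr", "b"])
# Each column label is now a (value name, hour) tuple of strings;
# map applies ''.join to every tuple to flatten it.
wide.columns = wide.columns.map("".join)
print(sorted(wide.columns))  # ['b11', 'b12', 'spyhr11', 'spyhr12']
```

Both forms produce the same flat labels; the join fails if any level contains non-string values, which is why the hour is converted to a string first.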

Answer 2 (score: -1)

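# Assumes fixed-width string labels such as '(spyhr, 10:00:00)':
# characters 1-5 hold the name and characters 8-9 hold the hour.
rename = [cname[1:6] + cname[8:10] for cname in df.columns]
df.columns = rename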