Transforming an array in Python

Time: 2016-04-26 14:44:54

Tags: python numpy pandas dataframe

I have a DataFrame like this:

    Variable   Date     Value
0  Variable1  Date1  Valeur 1
1  Variable1  Date2  Valeur 2
2  Variable1  Date3  Valeur 3
3  Variable2  Date4  Valeur 4
4  Variable2  Date5  Valeur 5

I would like to transform it like this:

    Date Variable1 Variable2
0  Date1  Valeur 1      None
1  Date2  Valeur 2      None
2  Date3  Valeur 3      None
3  Date4      None  Valeur 4
4  Date5      None  Valeur 5

How can I do this transformation in Python using pandas or numpy? Thanks for your help.

2 answers:

Answer 0: (score: 4)

I think you need pivot, rename_axis (new in pandas 0.18.0) and reset_index:

print(df.pivot(index='Date', columns='Variable', values='Value')
        .rename_axis(None, axis=1)
        .reset_index())

    Date Variable1 Variable2
0  Date1  Valeur 1      None
1  Date2  Valeur 2      None
2  Date3  Valeur 3      None
3  Date4      None  Valeur 4
4  Date5      None  Valeur 5

Sample:

import pandas as pd

df = pd.DataFrame({'Variable': {0: 'a', 1: 'a', 2: 'a', 3: 'b', 4: 'b'}, 
                    'Date': {0: pd.Timestamp('2016-02-05 00:00:00'), 
                             1: pd.Timestamp('2016-02-06 00:00:00'),
                             2: pd.Timestamp('2016-02-07 00:00:00'), 
                             3: pd.Timestamp('2016-02-08 00:00:00'), 
                             4: pd.Timestamp('2016-02-09 00:00:00')}, 
                    'Value': {0: 1, 1: 2, 2: 3, 3: 4, 4: 5}},
                    columns=['Variable','Date','Value'])

print(df)
  Variable       Date  Value
0        a 2016-02-05      1
1        a 2016-02-06      2
2        a 2016-02-07      3
3        b 2016-02-08      4
4        b 2016-02-09      5

print(df.pivot(index='Date', columns='Variable', values='Value')
        .rename_axis(None, axis=1)
        .reset_index())

        Date    a    b
0 2016-02-05  1.0  NaN
1 2016-02-06  2.0  NaN
2 2016-02-07  3.0  NaN
3 2016-02-08  NaN  4.0
4 2016-02-09  NaN  5.0
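
To show what each call in the chain contributes, here is a minimal sketch that runs the three steps one at a time on the same sample data. The intermediate names (pivoted, unnamed, result) are only for illustration and are not part of the answer above.

```python
import pandas as pd

# Same sample data as above, built with lists for brevity.
df = pd.DataFrame({
    'Variable': ['a', 'a', 'a', 'b', 'b'],
    'Date': pd.date_range('2016-02-05', periods=5),
    'Value': [1, 2, 3, 4, 5],
})

# Step 1: 'Date' becomes the index, and each distinct 'Variable'
# value becomes its own column. Missing combinations are filled with NaN.
pivoted = df.pivot(index='Date', columns='Variable', values='Value')

# After pivoting, the columns axis is named 'Variable'.
# Step 2: remove that axis name so it does not show up in the header.
unnamed = pivoted.rename_axis(None, axis=1)

# Step 3: turn the 'Date' index back into an ordinary column.
result = unnamed.reset_index()
print(result)
```

Inspecting pivoted.columns.name before and after step 2 is a quick way to see why rename_axis is in the chain at all.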

Answer 1: (score: 0)

As a complement, here is a way to split a column based on a condition:

import numpy as np
import pandas as pd

df = pd.DataFrame({'Variable': np.arange(5)}, index=pd.date_range('2016/4/26', periods=5))
"""
            Variable
2016-04-26         0
2016-04-27         1
2016-04-28         2
2016-04-29         3
2016-04-30         4
"""

cond = df < 3
# df[cond] keeps the shape and puts NaN where the condition is False
df[cond].join(df[~cond], lsuffix='1', rsuffix='2')
"""
            Variable1  Variable2
2016-04-26        0.0        NaN
2016-04-27        1.0        NaN
2016-04-28        2.0        NaN
2016-04-29        NaN        3.0
2016-04-30        NaN        4.0
"""