Renaming Pandas DataFrame columns inside a function does not work

Time: 2018-01-11 10:16:12

Tags: python pandas dataframe

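I want to implement this function so that it creates new columns with new names. If I apply the code line by line, it works perfectly. If I run the function, the line lag.columns = [rename] has no effect.

What is going on?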

T  = [50, 48, 47, 49, 51, 53, 54, 52]
v1 = [1, 3, 2, 4, 5, 5, 6, 2] 
v2 = [2, 5, 4, 2, 3, 1, 6, 9]

dataframe = pd.DataFrame({'T': T, 'v1': v1, 'v2': v2})


def timeseries_to_supervised(data, ts=1, dropnan=True):
    '''
    Helper function to convert a timeseries dataframe to supervised
    The response must be placed as the first column
    Arguments:
        :data --> dataframe to transform into supervised
        :timesteps --> number of timesteps we want to shift
    Returns:
        :final --> numpy array transformed        
    ''' 
    # n_vars = 1 if type(data) is list else data.shape[1]
    # y = data.loc[1]

    # Create lags
    for i, col in enumerate(list(data)):

        name = col
        rename = name + '(t-1)'
        lag  = pd.DataFrame(data.iloc[:, i]).shift(1)
        lag.colums = [rename]
        data = pd.concat([data, lag], axis=1)

    return data

reframed = timeseries_to_supervised(dataframe, 1)

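So it returns the dataframe with the new columns, but the names of the new columns do not include the '(t-1)' suffix.

Thanks in advance!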

2 answers:

Answer 0 (score: 1)

This works for me:

import pandas as pd
T  = [50, 48, 47, 49, 51, 53, 54, 52]
v1 = [1, 3, 2, 4, 5, 5, 6, 2] 
v2 = [2, 5, 4, 2, 3, 1, 6, 9]

dataframe = pd.DataFrame({'T': T, 'v1': v1, 'v2': v2})


def timeseries_to_supervised(data, ts=1, dropnan=True):

    # n_vars = 1 if type(data) is list else data.shape[1]
    # y = data.loc[1]

    # Create lags
    for i, col in enumerate(list(data)):

        name = col
        rename = name + '(t-1)'
        lag = pd.DataFrame(data.iloc[:, i].shift(1).values, columns=[rename], index=data.index)
        data = pd.concat([data, lag], axis=1)

    return data

reframed = timeseries_to_supervised(dataframe, 1)
print(reframed)

I only changed how the new lag column is created. This gives me:

    T  v1  v2  T(t-1)  v1(t-1)  v2(t-1)
0  50   1   2     NaN      NaN      NaN
1  48   3   5    50.0      1.0      2.0
2  47   2   4    48.0      3.0      5.0
3  49   4   2    47.0      2.0      4.0
4  51   5   3    49.0      4.0      2.0
5  53   5   1    51.0      5.0      3.0
6  54   6   6    53.0      5.0      1.0
7  52   2   9    54.0      6.0      6.0
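
For what it's worth, the same frame can probably be built without the loop. This is only a sketch and not the code from the answer above: it assumes every column should get a lag of the same size, and it relies on DataFrame.shift plus DataFrame.add_suffix to do the renaming. The function name add_lags is made up for illustration.

```python
import pandas as pd

dataframe = pd.DataFrame({
    'T':  [50, 48, 47, 49, 51, 53, 54, 52],
    'v1': [1, 3, 2, 4, 5, 5, 6, 2],
    'v2': [2, 5, 4, 2, 3, 1, 6, 9],
})


def add_lags(data, ts=1):
    # Shift every column down by ts rows and tag the copies with '(t-ts)'
    lagged = data.shift(ts).add_suffix('(t-{})'.format(ts))
    # Put the lagged copies to the right of the original columns
    return pd.concat([data, lagged], axis=1)


reframed = add_lags(dataframe, 1)
print(reframed)
```

Because the suffix is applied to all columns in one call, there is no per-column attribute assignment left to mistype.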

Answer 1 (score: 1)

You have a typo:

lag.colums = [rename]

This should be:

lag.columns = [rename]

With that change it works for me. Here is my output:

    T  v1  v2  T(t-1)  v1(t-1)  v2(t-1)
0  50   1   2     NaN      NaN      NaN
1  48   3   5    50.0      1.0      2.0
2  47   2   4    48.0      3.0      5.0
3  49   4   2    47.0      2.0      4.0
4  51   5   3    49.0      4.0      2.0
5  53   5   1    51.0      5.0      3.0
6  54   6   6    53.0      5.0      1.0
7  52   2   9    54.0      6.0      6.0
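
Some background on why the typo fails silently instead of raising an error: a DataFrame accepts assignment to an attribute name it does not know, so lag.colums = [rename] just stores the list on the object as an ordinary Python attribute and leaves the column labels untouched. A minimal sketch on a small made-up frame (as far as I know, newer pandas versions emit a UserWarning on the misspelled line, which is silenced here):

```python
import warnings

import pandas as pd

df = pd.DataFrame({'T': [50, 48, 47]})
lag = df[['T']].shift(1)

with warnings.catch_warnings():
    # Newer pandas versions may warn about list-like attribute assignment
    warnings.simplefilter('ignore')
    lag.colums = ['T(t-1)']  # typo: sets a plain attribute, not the labels

after_typo = list(lag.columns)
print(after_typo)  # the column is still called 'T'

lag.columns = ['T(t-1)']  # correct spelling: replaces the column labels
after_fix = list(lag.columns)
print(after_fix)
```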