Python 2.7: Shifting a DataFrame and its column values by day

Asked: 2017-08-08 09:41:49

Tags: python python-2.7 pandas dataframe

I have a DataFrame named df1, shown below:

df1:

               a   b    id
2010-01-01     2   3    21
2010-01-01     2   4    22
2010-01-01     3   5    23
2010-01-01     4   6    24
2010-01-02     1   4    21
2010-01-02     2   5    22
2010-01-02     3   6    23
2010-01-02     4   7    24
2010-01-03     1   8    21
2010-01-03     2   9    22
2010-01-03     3   10   23
2010-01-03     4   11   24
...........................

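I want to shift the values of a, b and id so that the values in row i become the values in row i + 1. As you can see in df1, several rows share the same date and differ in id. I want to shift df1 by day, matched on id: the 2010-01-02 values should become the 2010-01-03 values (for example, the 2010-01-02 values for id 21 become the 2010-01-03 values for id 21). Thanks!

Desired output: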

               a     b     id
2010-01-01     NaN   NaN   NaN
2010-01-01     NaN   NaN   NaN
2010-01-01     NaN   NaN   NaN
2010-01-01     NaN   NaN   NaN
2010-01-02     2     3     21
2010-01-02     2     4     22
2010-01-02     3     5     23
2010-01-02     4     6     24
2010-01-03     1     4     21
2010-01-03     2     5     22
2010-01-03     3     6     23
2010-01-03     4     7     24
...........................

2 answers:

Answer 0 (score: 2)

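If the dates are sorted, one approach is to count the rows that share the first date (via shape) and shift by that many rows, i.e.

# Number of rows for the first date, taken from shape
df.shift(df.loc[df.index[0]].shape[0])
# Or, equivalently, with len
df.shift(len(df.loc[df.index[0]]))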

Output:

              a    b    id
2010-01-01  NaN  NaN   NaN
2010-01-01  NaN  NaN   NaN
2010-01-01  NaN  NaN   NaN
2010-01-01  NaN  NaN   NaN
2010-01-02  2.0  3.0  21.0
2010-01-02  2.0  4.0  22.0
2010-01-02  3.0  5.0  23.0
2010-01-02  4.0  6.0  24.0
2010-01-03  1.0  4.0  21.0
2010-01-03  2.0  5.0  22.0
2010-01-03  3.0  6.0  23.0
2010-01-03  4.0  7.0  24.0
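
For anyone who wants to reproduce this, here is a minimal, self-contained sketch of the approach above. It rebuilds the sample frame from the question; the names `dates`, `rows_per_day` and `shifted` are mine, and the `value_counts` guard is an extra safety check that is not part of the answer. It assumes the index is sorted and every date has the same number of rows.

```python
import pandas as pd

# Rebuild the sample frame from the question: 3 dates x 4 ids.
dates = pd.to_datetime(["2010-01-01"] * 4 + ["2010-01-02"] * 4 + ["2010-01-03"] * 4)
df = pd.DataFrame(
    {
        "a": [2, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4],
        "b": [3, 4, 5, 6, 4, 5, 6, 7, 8, 9, 10, 11],
        "id": [21, 22, 23, 24] * 3,
    },
    index=dates,
)

# Rows that share the first date (4 in the sample).
rows_per_day = len(df.loc[df.index[0]])

# Extra guard (my assumption, not in the answer): a positional shift only
# lines rows up correctly if every date has the same number of rows.
assert (df.index.value_counts() == rows_per_day).all()

# Move every row down by one day's worth of rows; the index stays put.
shifted = df.shift(rows_per_day)
print(shifted)
```

The integer columns come back as floats because the first block of rows is filled with NaN.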

Answer 1 (score: 2)

If all groups have the same length (4 in the sample) and the DatetimeIndex is sorted:

df2 = df.shift((df.index == df.index[0]).sum())
print(df2)
              a    b    id
2010-01-01  NaN  NaN   NaN
2010-01-01  NaN  NaN   NaN
2010-01-01  NaN  NaN   NaN
2010-01-01  NaN  NaN   NaN
2010-01-02  2.0  3.0  21.0
2010-01-02  2.0  4.0  22.0
2010-01-02  3.0  5.0  23.0
2010-01-02  4.0  6.0  24.0
2010-01-03  1.0  4.0  21.0
2010-01-03  2.0  5.0  22.0
2010-01-03  3.0  6.0  23.0
2010-01-03  4.0  7.0  24.0
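
If the groups are not all the same length, the positional shift above no longer lines rows up by id. One possible alternative, which is not from either answer, is to shift within each id using `groupby`. The sketch below uses a small made-up frame with uneven groups and keeps the date as an ordinary column while shifting; all names in it are illustrative. Note that it gives each row the values from the previous observation for the same id, which is the previous day only if that id has a row for every day.

```python
import pandas as pd

# Hypothetical data with uneven groups: id 23 appears on one date only.
df = pd.DataFrame(
    {
        "date": pd.to_datetime(
            ["2010-01-01", "2010-01-01", "2010-01-02",
             "2010-01-02", "2010-01-02", "2010-01-03"]
        ),
        "a": [2, 2, 1, 2, 3, 1],
        "b": [3, 4, 4, 5, 6, 8],
        "id": [21, 22, 21, 22, 23, 21],
    }
)

# Order rows so that, within each id, earlier dates come first.
df = df.sort_values(["id", "date"])

# Shift a and b by one row inside each id group.
prev = df.groupby("id")[["a", "b"]].shift(1)

# Reattach date and id, then restore the date-indexed layout.
result = (
    df[["date", "id"]]
    .join(prev)
    .sort_values(["date", "id"])
    .set_index("date")
)
print(result)
```

Here the first row of each id is NaN, including the single row for id 23, and id stays with its own row instead of being shifted along with a and b.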

But if you need to shift the index values themselves by one day:

df3 = df.shift(1, freq='D')
print(df3)
            a   b  id
2010-01-02  2   3  21
2010-01-02  2   4  22
2010-01-02  3   5  23
2010-01-02  4   6  24
2010-01-03  1   4  21
2010-01-03  2   5  22
2010-01-03  3   6  23
2010-01-03  4   7  24
2010-01-04  1   8  21
2010-01-04  2   9  22
2010-01-04  3  10  23
2010-01-04  4  11  24
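
To make the difference between the two kinds of shift concrete, here is a small sketch on a trimmed-down frame (two dates, two ids). The frame and the names `by_rows` and `by_freq` are illustrative only. Shifting by a row count moves the values and leaves the index alone, while shifting with `freq` moves the index and leaves the values alone.

```python
import pandas as pd

# Trimmed-down version of the sample: 2 dates x 2 ids.
dates = pd.to_datetime(["2010-01-01"] * 2 + ["2010-01-02"] * 2)
df = pd.DataFrame(
    {"a": [2, 2, 1, 2], "b": [3, 4, 4, 5], "id": [21, 22, 21, 22]},
    index=dates,
)

# Values move down two rows; the index is unchanged and NaN fills the top.
by_rows = df.shift(2)

# The index moves forward one day; the values are unchanged and no NaN appears.
by_freq = df.shift(1, freq="D")

print(by_rows)
print(by_freq)
```

With `freq`, the result has no rows for the first date and gains rows for a new last date, so it does not contain the NaN rows shown in the question's desired output.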