I have this DataFrame, DF:
Stock Date Time Price Open
AAA 2002-02-23 10:13 2.440 0.01
AAA 2002-02-27 17:17 2.460 0.02
and I want it to become this (Transformed):
Stock Date Time_0 Price_0 Open_0 Time_1 Price_1 Open_1
AAA 2002-02-23 10:13 2.440 0.01 17:17 2.460 0.02
AAA 2002-02-27 17:17 2.460 0.02 NA NA NA
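I want to apply this operation to a much larger dataset. Is there an efficient way to do it? (The image shows a more detailed representation.)
Edit: solved. "How to create a lagged data structure using pandas dataframe" answers this question.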
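For readers who want the idea in one place, here is a minimal sketch of building such a lagged (here: "lead") structure with `DataFrame.shift`. It is not necessarily the code from the linked answer; the helper name `add_leads` and its `n_steps` parameter are hypothetical and introduced only for illustration.

```python
import pandas as pd

def add_leads(frame, value_cols, n_steps=1):
    """Append copies of value_cols taken from the next 1..n_steps rows.

    Hypothetical helper for illustration; not part of pandas.
    """
    # Step 0 is the current row: give its value columns a _0 suffix.
    out = frame.rename(columns={c: f"{c}_0" for c in value_cols})
    for k in range(1, n_steps + 1):
        # shift(-k) moves values up by k rows, so row i receives row i+k.
        shifted = frame[value_cols].shift(-k).add_suffix(f"_{k}")
        out = pd.concat([out, shifted], axis=1)
    return out

# The two-row frame from the question.
df = pd.DataFrame({
    "Stock": ["AAA", "AAA"],
    "Date": ["2002-02-23", "2002-02-27"],
    "Time": ["10:13", "17:17"],
    "Price": [2.44, 2.46],
    "Open": [0.01, 0.02],
})

transformed = add_leads(df, ["Time", "Price", "Open"], n_steps=1)
print(transformed)
```

Because `shift` is vectorized, this avoids a Python-level loop over rows, which is usually what matters on a larger dataset. Rows with no following row get missing values in the `_1` columns.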
Answer 0 (score: 0)
Data setup:
import numpy as np
import pandas as pd

df = pd.DataFrame({'Stock': {0: 'AAA', 1: 'AAA', 2: 'AAA'},
'Date': {0: '2002-02-23', 1: '2002-02-27', 2: '2002-02-27'},
'Time': {0: '10:13', 1: '17:17', 2: '17:17'},
'Price': {0: 2.44, 1: 2.46, 2: 3.2},
'Open': {0: 0.01, 1: 0.02, 2: 0.02}
})
#Reorder columns
df = df[['Stock','Date','Time','Price','Open']]
df
Out[1221]:
Stock Date Time Price Open
0 AAA 2002-02-23 10:13 2.44 0.01
1 AAA 2002-02-27 17:17 2.46 0.02
2 AAA 2002-02-27 17:17 3.20 0.02
Solution:
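# Get the 'Time', 'Price', 'Open' fields from the next row and create a new dataframe.
# .ix was removed in pandas 1.0, so .loc is used instead (assumes the default 0..n-1 index).
cols = ['Time', 'Price', 'Open']
df_1 = df.apply(lambda x: df.loc[x.name + 1, cols] if (x.name + 1) < len(df) else pd.Series(np.nan, index=cols), axis=1)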
#join the original df and the new df
df.join(df_1,lsuffix='_0',rsuffix='_1')
Out[1223]:
Stock Date Time_0 Price_0 Open_0 Time_1 Price_1 Open_1
0 AAA 2002-02-23 10:13 2.44 0.01 17:17 2.46 0.02
1 AAA 2002-02-27 17:17 2.46 0.02 17:17 3.20 0.02
2 AAA 2002-02-27 17:17 3.20 0.02 NaN NaN NaN
With the OP's original data, the output would be:
Out[1270]:
Stock Date Time_0 Price_0 Open_0 Time_1 Price_1 Open_1
0 AAA 2002-02-23 10:13 2.44 0.01 17:17 2.46 0.02
1 AAA 2002-02-27 17:17 2.46 0.02 NaN NaN NaN
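The row-wise `apply` above calls a Python function once per row, which can be slow on a large frame. A possible vectorized alternative is sketched below using `shift(-1)`, grouped by `Stock` so that the "next row" never comes from a different stock. The second stock (`BBB`) and its values are made up for illustration, and sorting by `Stock`, `Date`, `Time` is an assumption about what "next" should mean.

```python
import pandas as pd

# Illustrative data: the AAA rows come from the question, the BBB rows are invented.
df = pd.DataFrame({
    "Stock": ["AAA", "AAA", "BBB", "BBB"],
    "Date": ["2002-02-23", "2002-02-27", "2002-02-23", "2002-02-27"],
    "Time": ["10:13", "17:17", "09:30", "16:00"],
    "Price": [2.44, 2.46, 5.10, 5.25],
    "Open": [0.01, 0.02, 0.03, 0.04],
})
cols = ["Time", "Price", "Open"]

# Sort so that "next row" means the next observation of the same stock.
df = df.sort_values(["Stock", "Date", "Time"]).reset_index(drop=True)

# shift(-1) within each Stock group: the last row of every stock gets missing values.
next_rows = df.groupby("Stock")[cols].shift(-1)

# Overlapping column names receive the _0 / _1 suffixes, as in the answer above.
result = df.join(next_rows, lsuffix="_0", rsuffix="_1")
print(result)
```

If the data contains a single stock, `df[cols].shift(-1)` without the `groupby` gives the same result as the `apply` version.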