I have a df:
My expected output is (for each element in the column, the next 3 values on the same row):
           0    1    2    3    4
0  44.000000  0.0  0.0  0.0  0.0
1  42.200001  0.0  0.0  0.0  0.0
2  44.799999  0.0  0.0  0.0  0.0
3  47.520000  0.0  0.0  0.0  0.0
4  49.760000  0.0  0.0  0.0  0.0
5  53.420000  0.0  0.0  0.0  0.0
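What I am trying to do here has to be fast (the task has a speed requirement). I was wondering whether creating an empty df and then using .apply(lambda: #fncineed, axis = 1) would perform better than indexing through the whole data and applying a function to a moving window.
Answer 0 (score: 3)
If I understand your question correctly, try this: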
df.apply(lambda x: df['0'].shift(-df.columns.get_loc(x.name)))
Output:
           0          1          2      3      4
0  44.000000  42.200001  44.799999  47.52  49.76
1  42.200001  44.799999  47.520000  49.76  53.42
2  44.799999  47.520000  49.760000  53.42    NaN
3  47.520000  49.760000  53.420000    NaN    NaN
4  49.760000  53.420000        NaN    NaN    NaN
5  53.420000        NaN        NaN    NaN    NaN
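For anyone who wants to reproduce this, here is a minimal self-contained sketch. The sample frame is rebuilt from the values shown in the question, and the string column labels "0" to "4" are an assumption based on the df['0'] lookup in the one-liner; the name shifted is only for illustration.

```python
import pandas as pd

# Sample frame rebuilt from the question's values. String column labels
# "0".."4" are an assumption inferred from the df['0'] lookup.
values = [44.0, 42.200001, 44.799999, 47.52, 49.76, 53.42]
df = pd.DataFrame(0.0, index=range(len(values)), columns=list("01234"))
df["0"] = values

# Column at position k is replaced by column "0" shifted up by k rows.
# Rows with no later value available end up as NaN.
shifted = df.apply(lambda col: df["0"].shift(-df.columns.get_loc(col.name)))
print(shifted)
```

Because the shift amount comes from get_loc (the column's position), this does not depend on the labels being numeric.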
Answer 1 (score: 1)
If you want columns 1-3 to contain, row by row, the three values that follow in column 0, you can use shift():
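n = 3
# Keep column 0 as is; fill columns 1..n with column 0 shifted up by 1..n rows.
# Only the first n rows are filled, so the remaining rows come out as NaN.
pd.concat([df.iloc[:, 0],
           df.iloc[:, 1:].apply(lambda x: df.iloc[:, 0]
                                            .shift(-int(x.name))
                                            .iloc[:n])
                         .iloc[:, :n]], axis=1)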
           0          1          2      3
0  44.000000  42.200001  44.799999  47.52
1  42.200001  44.799999  47.520000  49.76
2  44.799999  47.520000  49.760000  53.42
3  47.520000        NaN        NaN    NaN
4  49.760000        NaN        NaN    NaN
5  53.420000        NaN        NaN    NaN
This assumes you do not want to fill columns 1-3 where no values are available.
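If you do want a placeholder instead of NaN, fillna() can be applied to the result. The sketch below is self-contained; the sample frame is rebuilt from the question's values, the string column labels are an assumption, and 0.0 as the fill value is a hypothetical choice, not something the question specifies.

```python
import pandas as pd

# Sample frame rebuilt from the question's values (string labels assumed).
values = [44.0, 42.200001, 44.799999, 47.52, 49.76, 53.42]
df = pd.DataFrame(0.0, index=range(len(values)), columns=list("01234"))
df["0"] = values

n = 3
first = df.iloc[:, 0]
# For column label k, take column 0 shifted up by k rows, first n rows only;
# then keep the first n of those columns.
following = (
    df.iloc[:, 1:]
    .apply(lambda col: first.shift(-int(col.name)).iloc[:n])
    .iloc[:, :n]
)
result = pd.concat([first, following], axis=1)

# Hypothetical choice: replace the missing entries with 0.0.
filled = result.fillna(0.0)
print(filled)
```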
Answer 2 (score: 0)
Or you can try this:
new = pd.concat([df['0'].shift(-x) for x in range(df.shape[1])], axis=1)
new.columns = df.columns
new
Out[178]:
           0          1          2      3      4
0  44.000000  42.200001  44.799999  47.52  49.76
1  42.200001  44.799999  47.520000  49.76  53.42
2  44.799999  47.520000  49.760000  53.42    NaN
3  47.520000  49.760000  53.420000    NaN    NaN
4  49.760000  53.420000        NaN    NaN    NaN
5  53.420000        NaN        NaN    NaN    NaN
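Since the question is about speed, here is a sketch of the same result built with NumPy index arithmetic instead of repeated shift() calls. This is an alternative technique, not something proposed in the answers above, and whether it is actually faster depends on the data size; it has not been benchmarked here. The sample frame and the names padded, idx and windowed are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Sample frame rebuilt from the question's values (string labels assumed).
values = [44.0, 42.200001, 44.799999, 47.52, 49.76, 53.42]
df = pd.DataFrame(0.0, index=range(len(values)), columns=list("01234"))
df["0"] = values

col = df["0"].to_numpy()
n_rows, n_cols = df.shape

# Pad with NaN so windows that run past the end read NaN.
padded = np.concatenate([col, np.full(n_cols - 1, np.nan)])

# idx[i, k] = i + k, so row i holds padded[i], padded[i + 1], ...
idx = np.arange(n_rows)[:, None] + np.arange(n_cols)
windowed = pd.DataFrame(padded[idx], index=df.index, columns=df.columns)
print(windowed)
```

The whole table is built with a single fancy-indexing operation, at the cost of materialising an n_rows by n_cols index array.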