I have a dataset that looks like this:
# Example
    0  1   2   3  4   5
0  18  1 -19 -16 -5  19
1  18  0 -19 -17 -6  19
2  17 -1 -20 -17 -6  19
3  18  1 -19 -16 -5  20
4  18  0 -19 -16 -5  20
Actual data:
[{0: 18, 1: 1, 2: -19, 3: -16, 4: -5, 5: 19},
{0: 18, 1: 0, 2: -19, 3: -17, 4: -6, 5: 19},
{0: 17, 1: -1, 2: -20, 3: -17, 4: -6, 5: 19},
{0: 18, 1: 1, 2: -19, 3: -16, 4: -5, 5: 20},
{0: 18, 1: 0, 2: -19, 3: -16, 4: -5, 5: 20},
{0: 18, 1: 0, 2: -20, 3: -15, 4: -4, 5: 20},
{0: 19, 1: 1, 2: -18, 3: -16, 4: -5, 5: 20},
{0: 18, 1: 0, 2: -19, 3: -17, 4: -7, 5: 18},
{0: 18, 1: 0, 2: -20, 3: -18, 4: -7, 5: 18},
{0: 17, 1: 0, 2: -19, 3: -17, 4: -7, 5: 18},
{0: 18, 1: 0, 2: -19, 3: -16, 4: -4, 5: 20},
{0: 18, 1: 1, 2: -19, 3: -16, 4: -5, 5: 20},
{0: 18, 1: 0, 2: -19, 3: -16, 4: -4, 5: 20},
{0: 18, 1: 0, 2: -19, 3: -16, 4: -5, 5: 20},
{0: 18, 1: 1, 2: -18, 3: -16, 4: -5, 5: 20},
{0: 17, 1: 0, 2: -20, 3: -16, 4: -5, 5: 19},
{0: 17, 1: 0, 2: -19, 3: -16, 4: -4, 5: 20},
{0: 18, 1: 0, 2: -19, 3: -15, 4: -4, 5: 20},
{0: 18, 1: 0, 2: -19, 3: -14, 4: -3, 5: 22},
{0: 18, 1: 1, 2: -18, 3: -14, 4: -4, 5: 22}]
The shape of the above is (20, 6).
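What I want to achieve is to apply a custom function to every column, 4 rows at a time.
Example: f() applied to all columns of df.ix[0:3] (rows 0 through 3, inclusive); f() applied to all columns of df.ix[4:7]; and so on.
In a way, I need a rolling window of size 4 that moves in steps of 4.
With the data above, the result would be a DataFrame of shape (5, 6). For the sake of argument, you can assume the custom function takes the mean of those 4 rows for each column.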
What have I tried so far?
The code:
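import pandas as pd

curr = 0
res = []
while curr < df_to_look_at2.shape[0]:
    # .ix was removed in pandas 1.0; .iloc[curr:curr + 4] selects the same 4 rows
    look_at = df_to_look_at2.iloc[curr:curr + 4]
    curr += 4
    res.append(look_at.mean().values.tolist())
pd.DataFrame(res)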
And the result:
       0     1      2      3     4      5
0  17.75  0.25 -19.25 -16.50 -5.50  19.25
1  18.25  0.25 -19.00 -16.00 -5.25  19.50
2  17.75  0.25 -19.25 -16.75 -5.75  19.00
3  17.75  0.25 -19.00 -16.00 -4.75  19.75
4  17.75  0.25 -18.75 -14.75 -3.75  21.00
One more thought: what if it should compute not only the mean, but also min(), max(), mean(), and some other custom functions?
Answer 0: (score: 1)
Rolling would be the right tool here if a row could fall into more than one window. Your windows do not overlap, though, so what you are really asking is how to group by a stride, which you can do with np.arange and floor division.
import numpy as np

window_size = 4
# 0, 0, 0, 0, 1, 1, 1, 1, ...: one label per block of window_size rows
grouper = np.arange(df.shape[0]) // window_size
df.groupby(grouper).mean()
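For the follow-up about min(), max(), mean(), and custom functions, the same grouper can probably be combined with `agg`. The following is only a sketch: `value_range` is a hypothetical custom function made up for illustration, and the data is the 20-row set from the question written as a list of lists.

```python
import numpy as np
import pandas as pd

# The 20 x 6 data from the question, one inner list per row.
rows = [
    [18, 1, -19, -16, -5, 19], [18, 0, -19, -17, -6, 19],
    [17, -1, -20, -17, -6, 19], [18, 1, -19, -16, -5, 20],
    [18, 0, -19, -16, -5, 20], [18, 0, -20, -15, -4, 20],
    [19, 1, -18, -16, -5, 20], [18, 0, -19, -17, -7, 18],
    [18, 0, -20, -18, -7, 18], [17, 0, -19, -17, -7, 18],
    [18, 0, -19, -16, -4, 20], [18, 1, -19, -16, -5, 20],
    [18, 0, -19, -16, -4, 20], [18, 0, -19, -16, -5, 20],
    [18, 1, -18, -16, -5, 20], [17, 0, -20, -16, -5, 19],
    [17, 0, -19, -16, -4, 20], [18, 0, -19, -15, -4, 20],
    [18, 0, -19, -14, -3, 22], [18, 1, -18, -14, -4, 22],
]
df = pd.DataFrame(rows)

window_size = 4
# Labels 0,0,0,0,1,1,1,1,... put each row into its block of 4 rows.
grouper = np.arange(len(df)) // window_size


def value_range(col):
    """Hypothetical custom function: max minus min within one block."""
    return col.max() - col.min()


# Built-in reductions are named by string; custom ones are passed as callables.
stats = df.groupby(grouper).agg(["min", "max", "mean", value_range])

# Columns form a two-level index: (original column, function name).
print(stats[(0, "mean")])
```

The result has one row per block and one column per (original column, function) pair, so a single statistic can be pulled out with a tuple such as `stats[(0, "mean")]`. If the row count is not a multiple of 4, the last group simply contains fewer rows.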
Answer 1: (score: 1)
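I think doing several computations this way really belongs on numpy's turf. You can use reshape to get the underlying array into the shape you need, then run whatever computations you want on the array.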
inp = [{0: 18, 1: 1, 2: -19, 3: -16, 4: -5, 5: 19},
{0: 18, 1: 0, 2: -19, 3: -17, 4: -6, 5: 19},
{0: 17, 1: -1, 2: -20, 3: -17, 4: -6, 5: 19},
{0: 18, 1: 1, 2: -19, 3: -16, 4: -5, 5: 20},
{0: 18, 1: 0, 2: -19, 3: -16, 4: -5, 5: 20},
{0: 18, 1: 0, 2: -20, 3: -15, 4: -4, 5: 20},
{0: 19, 1: 1, 2: -18, 3: -16, 4: -5, 5: 20},
{0: 18, 1: 0, 2: -19, 3: -17, 4: -7, 5: 18},
{0: 18, 1: 0, 2: -20, 3: -18, 4: -7, 5: 18},
{0: 17, 1: 0, 2: -19, 3: -17, 4: -7, 5: 18},
{0: 18, 1: 0, 2: -19, 3: -16, 4: -4, 5: 20},
{0: 18, 1: 1, 2: -19, 3: -16, 4: -5, 5: 20},
{0: 18, 1: 0, 2: -19, 3: -16, 4: -4, 5: 20},
{0: 18, 1: 0, 2: -19, 3: -16, 4: -5, 5: 20},
{0: 18, 1: 1, 2: -18, 3: -16, 4: -5, 5: 20},
{0: 17, 1: 0, 2: -20, 3: -16, 4: -5, 5: 19},
{0: 17, 1: 0, 2: -19, 3: -16, 4: -4, 5: 20},
{0: 18, 1: 0, 2: -19, 3: -15, 4: -4, 5: 20},
{0: 18, 1: 0, 2: -19, 3: -14, 4: -3, 5: 22},
{0: 18, 1: 1, 2: -18, 3: -14, 4: -4, 5: 22}]
import pandas as pd
df = pd.DataFrame(inp)
temp = df.values.reshape(-1, 4, df.shape[-1])
out = pd.DataFrame(temp.mean(axis=1))
Output:
       0     1      2      3     4      5
0  17.75  0.25 -19.25 -16.50 -5.50  19.25
1  18.25  0.25 -19.00 -16.00 -5.25  19.50
2  17.75  0.25 -19.25 -16.75 -5.75  19.00
3  17.75  0.25 -19.00 -16.00 -4.75  19.75
4  17.75  0.25 -18.75 -14.75 -3.75  21.00
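The reshaped array can feed other reductions as well. The sketch below uses a small made-up frame (not the data above) and assumes that a trailing partial block should be dropped, since reshape raises an error when the row count is not a multiple of the window. The names `window`, `n_full`, and `blocks` are illustrative only.

```python
import numpy as np
import pandas as pd

# Small illustrative frame: 8 rows, 2 columns (hypothetical data).
df = pd.DataFrame({"a": [1, 2, 3, 4, 5, 6, 7, 8],
                   "b": [8, 6, 4, 2, 0, -2, -4, -6]})
window = 4

# Assumption: keep only complete blocks, so the reshape below is always valid.
n_full = (len(df) // window) * window

# Shape becomes (number of blocks, rows per block, number of columns).
blocks = df.to_numpy()[:n_full].reshape(-1, window, df.shape[1])

# Reducing over axis=1 collapses the rows of each block, column by column.
block_mean = pd.DataFrame(blocks.mean(axis=1), columns=df.columns)
block_min = pd.DataFrame(blocks.min(axis=1), columns=df.columns)
block_max = pd.DataFrame(blocks.max(axis=1), columns=df.columns)

# A custom reduction only needs to accept an axis; np.ptp is max minus min.
block_range = pd.DataFrame(np.ptp(blocks, axis=1), columns=df.columns)

print(block_mean)
```

Each result has one row per block of 4 input rows. Dropping the incomplete tail is a design choice; padding it or handling it separately would work too, depending on what the custom function expects.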