I can't get the correct output. I want each pandas cell to hold multiple values.
import pandas as pd

inp = {
    'test1': [i for i in range(20)],
    'test2': [i for i in range(20, 40)]
}
dfinp = pd.DataFrame(inp)
# print(dfinp)

out = {
    'test1': [],
    'test2': []
}
dfout = pd.DataFrame(out)

groups = 3
rang = range(groups, len(dfinp))
for i in rang:
    # print(dfinp['test2'][i-groups:i])
    dfout['test1'] = dfout.test1.append(dfinp['test1'][i-groups:i], ignore_index=True)
    dfout['test2'] = dfout.test2.append(dfinp['test2'][i-groups:i], ignore_index=True)
This is what I get:
test1 test2
0 0 NaN
1 1 NaN
This is what I want. Each cell should be a Series or a NumPy array.
test1 test2
0 [0,1,2] [20,21,22]
1 [1,2,3] [21,22,23]
2 [2,3,4] [22,23,24]
3 [3,4,5] [23,24,25]
4 [4,5,6] [24,25,26]
5 [5,6,7] [25,26,27]
etc...
Thanks in advance.
Answer 0 (score: 0)
Try:
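import numpy as np

def fn(df, col, N):
    # Place N shifted copies of the column side by side, so each row holds
    # the value at that position plus the N - 1 values that follow it.
    df[col] = (
        pd.concat(
            [df[col].shift(-i) for i in range(N)],
            axis=1,
        )
        .dropna()       # drop the trailing rows whose window is incomplete
        .astype(int)    # shift() introduces NaN, which turns the values into floats
        .astype(str)    # note: the array elements are strings after this step
        .apply(np.array, axis=1)  # collapse each row into one array per cell
    )
    return df[col]

N = 3
dfinp["test1"] = fn(dfinp, "test1", N)
dfinp["test2"] = fn(dfinp, "test2", N)
# The last N - 1 rows have no complete window and are NaN, so drop them.
print(dfinp.dropna())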
Prints:
test1 test2
0 [0, 1, 2] [20, 21, 22]
1 [1, 2, 3] [21, 22, 23]
2 [2, 3, 4] [22, 23, 24]
3 [3, 4, 5] [23, 24, 25]
4 [4, 5, 6] [24, 25, 26]
5 [5, 6, 7] [25, 26, 27]
6 [6, 7, 8] [26, 27, 28]
7 [7, 8, 9] [27, 28, 29]
8 [8, 9, 10] [28, 29, 30]
9 [9, 10, 11] [29, 30, 31]
10 [10, 11, 12] [30, 31, 32]
11 [11, 12, 13] [31, 32, 33]
12 [12, 13, 14] [32, 33, 34]
13 [13, 14, 15] [33, 34, 35]
14 [14, 15, 16] [34, 35, 36]
15 [15, 16, 17] [35, 36, 37]
16 [16, 17, 18] [36, 37, 38]
17 [17, 18, 19] [37, 38, 39]
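
If the cells need to stay numeric rather than hold strings, one possible alternative (not part of the answer above) is to build the windows directly with NumPy. This is a minimal sketch that assumes NumPy 1.20 or later, where `sliding_window_view` is available. The helper name `rolling_cells` is hypothetical, and it returns a new DataFrame instead of modifying `dfinp`.

```python
import numpy as np
import pandas as pd
from numpy.lib.stride_tricks import sliding_window_view


def rolling_cells(df, width):
    """Return a new DataFrame whose cells hold `width` consecutive values.

    `rolling_cells` is an illustrative helper, not a pandas function.
    """
    out = {}
    for col in df.columns:
        # windows has shape (len(df) - width + 1, width); each row is one window
        windows = sliding_window_view(df[col].to_numpy(), width)
        # Copy each window so the cells are independent, writable arrays
        out[col] = [row.copy() for row in windows]
    return pd.DataFrame(out)


dfinp = pd.DataFrame({"test1": range(20), "test2": range(20, 40)})
dfout = rolling_cells(dfinp, 3)  # 20 input rows give 18 complete windows
print(dfout.head())
```

Each cell here is an integer `numpy.ndarray`, so something like `dfout["test1"].iloc[0].sum()` works without converting back from strings.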