Question

I can't get the correct output. I want each pandas cell to hold multiple values.

import pandas as pd

inp = {
    'test1':[i for i in range(20)],
    'test2':[i for i in range(20,40)]
    }

dfinp = pd.DataFrame(inp)
# print(dfinp)
out = {
    'test1':[],
    'test2':[]
        }

dfout = pd.DataFrame(out)

groups = 3

rang = range(groups,len(dfinp))

for i in rang:
    # print(dfinp['test2'][i-groups:i])
    dfout['test1']=dfout.test1.append(dfinp['test1'][i-groups:i],ignore_index=True)
    dfout['test2']=dfout.test2.append(dfinp['test2'][i-groups:i],ignore_index=True)

This is what I get:

   test1  test2
0      0    NaN
1      1    NaN

This is what I want. Each cell should be a pandas Series or a NumPy array.

     test1     test2
0   [0,1,2]   [20,21,22]
1   [1,2,3]   [21,22,23]
2   [2,3,4]   [22,23,24]
3   [3,4,5]   [23,24,25]
4   [4,5,6]   [24,25,26]
5   [5,6,7]   [25,26,27]
etc...

Thanks in advance.

Answer 1

Try:

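import numpy as np
import pandas as pd


def fn(df, col, N):
    # Build N shifted copies of the column side by side, so that row i
    # holds the values at positions i, i+1, ..., i+N-1.
    df[col] = (
        pd.concat(
            [df[col].shift(-i) for i in range(N)],
            axis=1,
        )
        .dropna()        # drop the trailing rows whose window is incomplete
        .astype(int)     # shift() introduced NaN, which made the values float
        .astype(str)     # note: the array elements become strings here
        .apply(np.array, axis=1)  # collapse each row into one array per cell
    )
    return df[col]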


N = 3

dfinp["test1"] = fn(dfinp, "test1", N)
dfinp["test2"] = fn(dfinp, "test2", N)

print(dfinp.dropna())

Prints:

           test1         test2
0      [0, 1, 2]  [20, 21, 22]
1      [1, 2, 3]  [21, 22, 23]
2      [2, 3, 4]  [22, 23, 24]
3      [3, 4, 5]  [23, 24, 25]
4      [4, 5, 6]  [24, 25, 26]
5      [5, 6, 7]  [25, 26, 27]
6      [6, 7, 8]  [26, 27, 28]
7      [7, 8, 9]  [27, 28, 29]
8     [8, 9, 10]  [28, 29, 30]
9    [9, 10, 11]  [29, 30, 31]
10  [10, 11, 12]  [30, 31, 32]
11  [11, 12, 13]  [31, 32, 33]
12  [12, 13, 14]  [32, 33, 34]
13  [13, 14, 15]  [33, 34, 35]
14  [14, 15, 16]  [34, 35, 36]
15  [15, 16, 17]  [35, 36, 37]
16  [16, 17, 18]  [36, 37, 38]
17  [17, 18, 19]  [37, 38, 39]
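
As an alternative to the shift-and-concat approach above, the same windows can probably be built directly with NumPy. The following is only a sketch, assuming NumPy 1.20 or later (where `sliding_window_view` is available); `dfout` is rebuilt here as a new frame instead of overwriting `dfinp`, which is a choice made for this illustration and not part of the original answer.

```python
import numpy as np
import pandas as pd
from numpy.lib.stride_tricks import sliding_window_view

# Same input as in the question.
dfinp = pd.DataFrame({
    "test1": list(range(20)),
    "test2": list(range(20, 40)),
})

N = 3  # window length, same role as N in the answer above

# sliding_window_view returns a 2-D view of shape (len - N + 1, N).
# list() splits it into one 1-D array per row, so each cell holds an array.
dfout = pd.DataFrame({
    col: list(sliding_window_view(dfinp[col].to_numpy(), N))
    for col in dfinp.columns
})

print(dfout.head())
```

Here the elements stay integers, and there are no incomplete trailing rows to drop. The arrays are read-only views into the original data by default, so copy them before modifying them in place.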

Pandas Series in a pandas cell
