Question

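I am trying to create a vector of the previous 10 values from a pandas column and insert it back into the pandas DataFrame as a list in a cell.

The code below works, but I need to do this on a DataFrame with more than 30 million rows, so looping over it takes far too long.

Can someone help me convert this into a numpy function that I can apply? I would also like to be able to use it inside a groupby.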

import pandas as pd

df = pd.DataFrame(list(range(1,20)),columns = ['A'])

df.insert(0,'Vector','')
df['Vector'] = df['Vector'].astype(object)

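for index, row in df.iterrows():
    # .at writes to the frame directly; df['Vector'].iloc[index] = ... is chained assignment and may write to a copy
    df.at[index, 'Vector'] = list(df['A'].iloc[(index-10):index])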

I have tried several approaches but could not get any of them to work. Any help would be greatly appreciated.

Answer 1

IIUC

values = df.A.tolist()  # convert once, not on every iteration
df['New']=[values[max(0,x-10):x] for x in range(len(df))]
df
Out[123]: 
     A                                      New
0    1                                       []
1    2                                      [1]
2    3                                   [1, 2]
3    4                                [1, 2, 3]
4    5                             [1, 2, 3, 4]
5    6                          [1, 2, 3, 4, 5]
6    7                       [1, 2, 3, 4, 5, 6]
7    8                    [1, 2, 3, 4, 5, 6, 7]
8    9                 [1, 2, 3, 4, 5, 6, 7, 8]
9   10              [1, 2, 3, 4, 5, 6, 7, 8, 9]
10  11          [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
11  12         [2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
12  13        [3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
13  14       [4, 5, 6, 7, 8, 9, 10, 11, 12, 13]
14  15      [5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
15  16     [6, 7, 8, 9, 10, 11, 12, 13, 14, 15]
16  17    [7, 8, 9, 10, 11, 12, 13, 14, 15, 16]
17  18   [8, 9, 10, 11, 12, 13, 14, 15, 16, 17]
18  19  [9, 10, 11, 12, 13, 14, 15, 16, 17, 18]
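
Since the question asks for a numpy version that also works per group, here is a minimal sketch rather than a definitive implementation. It assumes NumPy 1.20 or later (for `sliding_window_view`). The helper names `previous_values` and `previous_values_by_group`, and the grouping column `G` in the usage note, are made up for illustration. The first `window` rows are still built with plain slices because they have fewer than `window` predecessors; only the full windows come from the strided view.

```python
import numpy as np
import pandas as pd
from numpy.lib.stride_tricks import sliding_window_view


def previous_values(values, window=10):
    """For each position i, return the list of up to `window` values before i."""
    arr = np.asarray(values)
    n = len(arr)
    # The first `window` positions have fewer than `window` predecessors.
    out = [arr[max(0, i - window):i].tolist() for i in range(min(window, n))]
    if n > window:
        # Row j of the view is arr[j:j + window], i.e. the values that
        # precede position j + window. The last row has no target row.
        windows = sliding_window_view(arr, window)[:n - window]
        out.extend(windows.tolist())
    return out


def previous_values_by_group(df, key, col, window=10):
    """Hypothetical helper: same as above, but windows never cross groups."""
    out = [None] * len(df)
    values = df[col].to_numpy()
    # GroupBy.indices maps each group label to the integer positions of its rows.
    for positions in df.groupby(key).indices.values():
        vectors = previous_values(values[positions], window)
        for pos, vec in zip(positions, vectors):
            out[pos] = vec
    return pd.Series(out, index=df.index)


df = pd.DataFrame({"A": range(1, 20)})
df["New"] = pd.Series(previous_values(df["A"].to_numpy()), index=df.index)
```

For a grouped frame, something like `df["New"] = previous_values_by_group(df, "G", "A")` would fill the column group by group. Note that storing 30 million Python lists in an object column is itself expensive, so if a fixed width is acceptable, keeping the 2-D array from `sliding_window_view` may be cheaper than converting it to lists.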
