Question

I have a large DataFrame with many rows and columns.

An example of the structure:

import numpy as np
import pandas as pd
a = np.random.rand(6, 3)
df = pd.DataFrame(a)

I want to split the DataFrame into separate DataFrames, each consisting of 3 rows.

Answer 1

You can use numpy.split(), which requires that the rows divide evenly into the requested number of sections:

In [8]: df = pd.DataFrame(np.random.rand(9, 3))

In [9]: df
Out[9]:
          0         1         2
0  0.899366  0.991035  0.775607
1  0.487495  0.250279  0.975094
2  0.819031  0.568612  0.903836
3  0.178399  0.555627  0.776856
4  0.498039  0.733224  0.151091
5  0.997894  0.018736  0.999259
6  0.345804  0.780016  0.363990
7  0.794417  0.518919  0.410270
8  0.649792  0.560184  0.054238

In [10]: for x in np.split(df, len(df)//3):
    ...:     print(x)
    ...:
          0         1         2
0  0.899366  0.991035  0.775607
1  0.487495  0.250279  0.975094
2  0.819031  0.568612  0.903836
          0         1         2
3  0.178399  0.555627  0.776856
4  0.498039  0.733224  0.151091
5  0.997894  0.018736  0.999259
          0         1         2
6  0.345804  0.780016  0.363990
7  0.794417  0.518919  0.410270
8  0.649792  0.560184  0.054238
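
np.split raises a ValueError when the row count is not a multiple of the chunk size, and recent pandas releases may emit a deprecation warning when a DataFrame is passed to it. Below is a minimal sketch of an alternative, assuming plain positional slicing with iloc is acceptable. The helper name split_rows is hypothetical, not part of pandas or NumPy.

```python
import numpy as np
import pandas as pd


def split_rows(df, n):
    """Return a list of DataFrames with at most n rows each (hypothetical helper)."""
    # Step through row positions n at a time; the last chunk may be shorter.
    return [df.iloc[i:i + n] for i in range(0, len(df), n)]


# 10 rows do not divide evenly by 3, so the final chunk holds a single row.
df = pd.DataFrame(np.random.rand(10, 3))
chunks = split_rows(df, 3)
for chunk in chunks:
    print(chunk.shape)
```

Each chunk keeps its original index labels, so concatenating the chunks with pd.concat gives back the original DataFrame.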

Answer 2

You can use groupby:

g = df.groupby(np.arange(len(df)) // 3)

for n, grp in g:
    print(grp)

          0         1         2
0  0.278735  0.609862  0.085823
1  0.836997  0.739635  0.866059
2  0.691271  0.377185  0.225146
          0         1         2
3  0.435280  0.700900  0.700946
4  0.796487  0.018688  0.700566
5  0.900749  0.764869  0.253200

Turn it into a handy dictionary:

mydict = {k: v for k, v in g}
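
For reference, here is a self-contained sketch of the same groupby approach on a frame whose length is not a multiple of 3. The 7-row size and the variable names labels and parts are illustrative assumptions, not from the original answer.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.rand(7, 3))

# Integer-divide each row position by 3 to get a group label: 0,0,0,1,1,1,2
labels = np.arange(len(df)) // 3

# Iterating over a groupby object yields (label, sub-DataFrame) pairs.
parts = {key: group for key, group in df.groupby(labels)}

for key in sorted(parts):
    print(key, parts[key].shape)
```

Unlike np.split, this grouping does not require the row count to divide evenly. The last group simply contains the leftover rows.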
