I have a DataFrame df of shape (28260, 25). I want to split it into 20 smaller DataFrames, each of shape (1413, 25), named df_1, df_2, ..., df_20.
For example, with the input DataFrame:
frames = {}
for e, i in enumerate(np.split(df, 20)):
    frames.update([('df' + str(e + 1), pd.DataFrame(np.random.permutation(i), columns=df.columns))])
Answer 0 (score: 0)
If you want to keep all the DataFrames in a dict, here is one way to do it:
# import modules
import pandas as pd
import numpy as np
# Create dataframe of 25 columns and 28260 rows
df = pd.DataFrame({"col_"+str(i): np.random.randint(0, 10, 28260)
                   for i in range(25)})
print(df.head(5))
#    col_0  col_1  col_2  col_3  col_4  col_5  col_6  col_7  col_8  ...  col_16  col_17  col_18  col_19  col_20  col_21  col_22  col_23  col_24
# 0      5      0      1      5      9      7      2      9      5  ...       5       1       3       8       2       3       9       7       4
# 1      7      1      5      0      2      1      5      9      6  ...       6       1       1       7       8       7       0       2       1
# 2      0      3      6      1      3      8      7      4      7  ...       9       9       7       7       8       9       1       6       9
# 3      7      7      3      3      3      1      3      4      9  ...       2       2       7       9       8       0       2       0       8
# 4      0      1      3      9      7      4      4      3      8  ...       9       5       8       4       5       4       3       9       6
print("Dimension df: ", df.shape)
# Dimension df:  (28260, 25)
# Create dict of sub dataframe
dict_df = {"df_"+str(i): df.iloc[i*28260//20:(i+1)*28260//20] for i in range(20)}
print("Keys: ", dict_df.keys())
# Keys: dict_keys(['df_0', 'df_1', 'df_2', 'df_3', 'df_4', 'df_5', 'df_6', 'df_7', 'df_8',
# 'df_9', 'df_10', 'df_11', 'df_12', 'df_13', 'df_14', 'df_15', 'df_16',
# 'df_17', 'df_18', 'df_19'])
print("Size of each sub_dataframe: ", dict_df["df_1"].shape)
# Size of each sub_dataframe: (1413, 25)
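The dict above is keyed df_0 to df_19, while the question asks for df_1 to df_20. A minimal sketch of the same slicing idea with 1-based keys and without hard-coding the row count follows. The small stand-in DataFrame and the names n_parts and chunk are assumptions made for illustration, and the sketch assumes the number of rows divides evenly by n_parts.

```python
import numpy as np
import pandas as pd

# Small stand-in for the 28260 x 25 frame so the example runs quickly.
df = pd.DataFrame(np.arange(200).reshape(40, 5),
                  columns=["col_" + str(i) for i in range(5)])

n_parts = 20                 # number of sub-dataframes wanted (hypothetical name)
chunk = len(df) // n_parts   # rows per part; assumes an even split

# Keys start at 1 so they match the df_1 ... df_20 naming in the question.
dict_df = {"df_" + str(k + 1): df.iloc[k * chunk:(k + 1) * chunk]
           for k in range(n_parts)}

print(list(dict_df)[:3], dict_df["df_20"].shape)
# ['df_1', 'df_2', 'df_3'] (2, 5)
```

With the original 28260-row frame, chunk would be 1413 and each value would have shape (1413, 25).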
Or, in a list:
# List of sub dataframes
list_df = []
for i in range(20):
    list_df.append(df.iloc[i*28260//20:(i+1)*28260//20])
print("Number of sub_dataframes: ", len(list_df))
# Number of sub_dataframes: 20
print("Size of each sub_dataframe: ", list_df[0].shape)
# Size of each sub_dataframe: (1413, 25)
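If the row count might not divide evenly by the number of parts, one possible variation is to split the row positions with np.array_split and then slice with iloc. This is only a sketch on a small made-up DataFrame, not part of the original answer; the name positions is hypothetical.

```python
import numpy as np
import pandas as pd

# Made-up 10-row frame; 10 does not divide evenly by 3.
df = pd.DataFrame({"a": range(10), "b": range(10, 20)})

# Split the row positions (not the DataFrame itself), then slice with iloc.
# np.array_split allows uneven parts: the earlier parts get the extra rows.
positions = np.array_split(np.arange(len(df)), 3)
list_df = [df.iloc[pos] for pos in positions]

print([len(part) for part in list_df])
# [4, 3, 3]
```

Splitting an array of positions keeps the NumPy call on a plain array, so the result does not depend on how a given pandas version handles np.split applied directly to a DataFrame.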