Question

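I have successfully split a DataFrame into several smaller DataFrames. I am now struggling to give these DataFrames sequential names so that each one can be referenced independently.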

import numpy as np

shuffled = df.sample(frac=1)   # shuffle all rows
result = np.array_split(shuffled, 3)   # split into 3 roughly equal parts

for part in result:
    print(part, '\n')

   movie_id  1  2  5  borda  rank  IRAM
2         3  4  0  0      4     3     2
1         2  3  0  3      6     2     1 

   movie_id  1  2  5  borda  rank  IRAM
4         5  3  0  0      3     4     3
0         1  5  4  4     13     1     4 

   movie_id  1  2  5  borda  rank  IRAM
3         4  3  0  0      3     4     3

I would like to use a loop (or any other suitable method) to give these separate DataFrames sequential names.

For example:

df_1
   movie_id  1  2  5  borda  rank  IRAM
2         3  4  0  0      4     3     2
1         2  3  0  3      6     2     1 

df_2
   movie_id  1  2  5  borda  rank  IRAM
4         5  3  0  0      3     4     3
0         1  5  4  4     13     1     4 

df_3
   movie_id  1  2  5  borda  rank  IRAM
3         4  3  0  0      3     4     3

I have been looking for a solution for a while, but I could not find a satisfactory answer.

Answer 1

df_dict = {}
for index, split_df in enumerate(result):
    df_name = "df_{}".format(index)
    # Optional: tag the DataFrame with its name. This is a plain Python
    # attribute, so pandas may not carry it over to derived DataFrames.
    split_df.name = df_name
    # Store the DataFrame in the dictionary under its sequential name.
    df_dict[df_name] = split_df
print(df_dict)

{'df_0':    movie_id  1  2  4  5  6  7  8  9  10  11  12  borda
 9        10  3  2  0  0  0  4  0  0   0   0   0      9
 7         8  1  0  0  0  4  5  0  0   0   4   0     14
 6         7  4  0  0  0  2  5  3  4   4   0   0     22
 0         1  5  4  0  4  4  0  0  0   4   0   0     21,
 'df_1':    movie_id  1  2  4  5  6  7  8  9  10  11  12  borda
 8         9  5  0  0  0  4  5  0  0   4   5   0     23
 3         4  3  0  0  0  0  5  0  0   4   0   5     17
 5         6  5  0  0  0  0  0  0  5   0   0   0     10,
 'df_2':    movie_id  1  2  4  5  6  7  8  9  10  11  12  borda
 4         5  3  0  0  0  0  0  0  0   0   0   0      3
 2         3  4  0  0  0  0  0  0  0   0   0   0      4
 1         2  3  0  0  3  0  0  0  0   0   0   0      6}

You can then access any of the split DataFrames with df_dict[df_name].
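
As a minimal sketch of that lookup, the example below builds the dictionary and then retrieves pieces by name. The sample data and the names `positions` and `first_piece` are hypothetical, not from the original post. It also splits row positions and slices with `iloc` instead of passing the DataFrame to `np.array_split`, since newer pandas versions may warn when `np.array_split` is applied to a DataFrame directly.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the column names only mimic the question.
df = pd.DataFrame({"movie_id": [1, 2, 3, 4, 5], "borda": [13, 6, 4, 3, 3]})
shuffled = df.sample(frac=1, random_state=0)

# Split the row positions into 3 chunks, then slice the DataFrame with iloc.
positions = np.array_split(np.arange(len(shuffled)), 3)
result = [shuffled.iloc[idx] for idx in positions]

df_dict = {}
for index, part in enumerate(result):
    df_dict["df_{}".format(index)] = part

# Look up a single piece by its name.
first_piece = df_dict["df_0"]
print(first_piece)

# Dictionaries keep insertion order (Python 3.7+), so this loop visits
# df_0, df_1, df_2 in that order.
for name, part in df_dict.items():
    print(name, len(part))
```

Keeping the pieces in a dictionary means the number of parts can change without touching the rest of the code, which is usually easier to maintain than creating separate variables dynamically.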

Answer 2

This can be done by using a dictionary and adding all the DataFrames to it:

import numpy as np
import pandas as pd

df = pd.DataFrame({'Col1': np.random.randint(10, size=10)})
shuffled = df.sample(frac=1)
result = np.array_split(shuffled, 3)
d = {}
for i, part in enumerate(result):
    d['df_' + str(i)] = part  # use str(i + 1) to start numbering at 1

print(d['df_0'])
   Col1
7     7
6     0
4     5
2     3

print(d['df_1'])
   Col1
0     0
8     1
1     5

print(d['df_2'])
   Col1
5     2
3     2
9     4
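
The question asks for names starting at df_1. A small variation on the loop, sketched here under the assumption that `enumerate(..., start=1)` is acceptable in place of `str(i + 1)`, produces those keys. The sample data and the `positions` helper are illustrative only. When the number of pieces is fixed and known in advance, plain tuple unpacking is another option.

```python
import numpy as np
import pandas as pd

# Illustrative data: ten rows in a single column.
df = pd.DataFrame({"Col1": np.arange(10)})
shuffled = df.sample(frac=1, random_state=42)

# Split row positions into 3 chunks and slice with iloc.
positions = np.array_split(np.arange(len(shuffled)), 3)
result = [shuffled.iloc[idx] for idx in positions]

d = {}
for i, part in enumerate(result, start=1):  # numbering starts at 1
    d[f"df_{i}"] = part

print(list(d))  # ['df_1', 'df_2', 'df_3']

# With exactly three pieces, unpacking gives real variable names directly.
df_1, df_2, df_3 = result
```

Unpacking raises a ValueError if the number of pieces does not match the number of names, so the dictionary form is the safer choice when the split count can vary.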

Answer 3

You can use a dictionary like this:

d = {"df_"+str(k):v for (k,v) in [(i,result[i]) for i in range(len(result))]}
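
The comprehension above builds an intermediate list of (index, DataFrame) pairs. As a sketch, the same dictionary can be produced with `enumerate`, which yields those pairs directly. The sample data and the names `d_original` and `d_simple` are hypothetical and used only for comparison.

```python
import numpy as np
import pandas as pd

# Illustrative data split into three pieces via row positions.
df = pd.DataFrame({"Col1": range(6)})
positions = np.array_split(np.arange(len(df)), 3)
result = [df.iloc[idx] for idx in positions]

# Form used in the answer: index pairs built through an intermediate list.
d_original = {"df_" + str(k): v for (k, v) in [(i, result[i]) for i in range(len(result))]}

# Equivalent form: enumerate yields the (index, piece) pairs directly.
d_simple = {f"df_{i}": part for i, part in enumerate(result)}

print(list(d_simple))  # ['df_0', 'df_1', 'df_2']
```

Both dictionaries hold the same keys and the same DataFrame objects; the second form is only shorter to read.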

How can I give DataFrames sequential names using a loop?
