Question

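For example, I have a list of 100 dataframes; some have 8 columns, others 10, and others 12. I want to split them into groups by their number of columns. I tried using a dictionary, but I can't get the loop to append to it correctly.

Code I tried previously: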

col_count = [8, 10, 12]

d = dict.fromkeys(col_count, [])

for df in df_lst:
    for i in col_count:
        if i == len(df.columns):
            d[i] = df

But this seems to just replace the value in the dict each time. I also tried .append, but that seems to append to every key.

Answer 1

Don't assign df to d[column_count]; append it instead.

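You initialized d with d = dict.fromkeys(col_count, []), so d is a dictionary whose values are empty lists.

When you do d[i] = df, you replace the empty list with a DataFrame, so d ends up as a dictionary of DataFrames. If you do d[i].append(df) instead, you get a dictionary of lists of DataFrames (which is what you want, AFAIU). One caveat: dict.fromkeys(col_count, []) stores the same list object under every key, which is why .append seemed to add to all keys; build the dictionary with a separate list per key, for example {c: [] for c in col_count}.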
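A minimal sketch of why .append seemed to add to every key in the question. The string "df_a" is only a placeholder standing in for a DataFrame, and the names shared and separate are made up for illustration.

```python
col_count = [8, 10, 12]

# dict.fromkeys evaluates [] once, so every key refers to the same list object.
shared = dict.fromkeys(col_count, [])
shared[8].append("df_a")  # placeholder value standing in for a DataFrame
print(shared)  # {8: ['df_a'], 10: ['df_a'], 12: ['df_a']}

# A dict comprehension evaluates [] once per key, so each list is independent.
separate = {c: [] for c in col_count}
separate[8].append("df_a")
print(separate)  # {8: ['df_a'], 10: [], 12: []}
```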

I'm also not sure you need the col_count variable. You can just do d[len(df.columns)].append(df).
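One way to drop col_count entirely, sketched under the assumption that d is a collections.defaultdict(list) so that each new column count gets its own fresh list. The df_lst built here is a small hypothetical stand-in for the asker's list of 100 dataframes.

```python
from collections import defaultdict

import pandas as pd

# Hypothetical stand-in for df_lst: frames with 8, 10 and 12 columns.
df_lst = [
    pd.DataFrame(0, index=range(2), columns=range(n))
    for n in (8, 10, 12, 8, 10, 8)
]

# defaultdict(list) creates a new empty list the first time a key is used.
d = defaultdict(list)
for df in df_lst:
    d[len(df.columns)].append(df)

# Report how many frames landed in each column-count group.
for n_cols, frames in sorted(d.items()):
    print(n_cols, len(frames))
```

With a plain dict the same loop would raise KeyError for a column count that has no entry yet, which is the reason for using defaultdict here.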

Answer 2

I think this is enough for what you are asking. Think about how to solve the problem dynamically to make better use of Python.

In [2]: import pandas as pd

In [3]: for i in range(1, 5):
   ...:     exec(f"df{i} = pd.DataFrame(0, index=range({i}), columns=list('ABCD'))") #making my own testing list of dataframes with variable length
   ...:

In [4]: df1 #one row df
Out[4]:
   A  B  C  D
0  0  0  0  0

In [5]: df2 #two row df
Out[5]:
   A  B  C  D
0  0  0  0  0
1  0  0  0  0

In [6]: df3 #three row df
Out[6]:
   A  B  C  D
0  0  0  0  0
1  0  0  0  0
2  0  0  0  0

In [7]: L = [df1, df2, df3, df4] #i assume all your dataframes are put into something like a container, which is the problem

In [13]: my_3_length_shape_dfs = [] #you need to create some sort of containers for your lengths (you can do an additional exec in the following In

In [14]: for i in L:
    ...:     if i.shape[0] == 3: #add more of these if needed, you mentioned your lengths are known [8, 10, 12]
    ...:         my_3_length_shape_dfs.append(i) #adding the df to a specified container, thus grouping any dfs that are of row length/shape equal to 3
    ...:         print(i)
    ...:
   A  B  C  D
0  0  0  0  0
1  0  0  0  0
2  0  0  0  0
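The session above groups by row count (shape[0]) with one hand-made container per length. A possible generalization, offered only as a sketch: a single helper that keys a dictionary on either axis of shape, so it covers row counts as well as the column counts from the question. The helper name group_by_shape and its axis parameter are hypothetical, and the test frames are built with a list comprehension instead of exec.

```python
import pandas as pd


def group_by_shape(frames, axis=1):
    """Group DataFrames by row count (axis=0) or column count (axis=1)."""
    groups = {}
    for frame in frames:
        # setdefault creates a separate list for each size the first time it appears.
        groups.setdefault(frame.shape[axis], []).append(frame)
    return groups


# Test frames with 1 to 4 rows and 4 columns each, held in a list directly.
frames = [pd.DataFrame(0, index=range(n), columns=list("ABCD")) for n in range(1, 5)]

by_rows = group_by_shape(frames, axis=0)
by_cols = group_by_shape(frames, axis=1)
print(sorted(by_rows))  # [1, 2, 3, 4]
print(len(by_cols[4]))  # 4
```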

How to split/group a list of dataframes by the length of each dataframe
