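For example, I have a list of 100 DataFrames; some have 8 columns, others 10, and others 12. I want to split them into groups by their number of columns. I tried using a dictionary, but I couldn't get the loop to add to it correctly.
Code I tried previously: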
col_count = [8, 10, 12]
d = dict.fromkeys(col_count, [])
for df in df_lst:
    for i in col_count:
        if i == len(df.columns):
            d[i] = df
But this just seems to replace the value in the dict each time. I also tried .append, but that seems to append to every key.
Answer 0 (score: 0)
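Don't assign df to d[column_count]; append it instead.
You initialized d with d = dict.fromkeys(col_count, []), so d is a dictionary of empty lists. When you do d[i] = df, you replace the empty list with a DataFrame, so d ends up as a dictionary of DataFrames. If you do d[i].append(df) instead, you get a dictionary of lists of DataFrames (which is what you want, AFAIU). Note, though, that dict.fromkeys(col_count, []) stores the same list object under every key, which is why .append seemed to add to all keys; give each key its own list, for example with d = {c: [] for c in col_count}.
I'm also not sure you need the col_count variable. You can just do d[len(df.columns)].append(df).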
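A minimal sketch of the point above, assuming pandas is installed. Here df_lst is a small made-up list standing in for the 100 DataFrames, and the names shared, same_object and group_sizes are only for illustration. It first shows that dict.fromkeys with a list default puts one shared list under every key, then groups the DataFrames using one independent list per key.

```python
import pandas as pd

# Hypothetical stand-in for the asker's list: DataFrames with 8, 10 or 12 columns.
df_lst = [
    pd.DataFrame(0, index=range(2), columns=range(n))
    for n in (8, 10, 12, 8, 12, 12)
]
col_count = [8, 10, 12]

# Pitfall: fromkeys stores the *same* list object under every key,
# so appending through one key is visible through all of them.
shared = dict.fromkeys(col_count, [])
shared[8].append("x")
same_object = shared[8] is shared[10]

# One independent list per key, then append each DataFrame to its group.
d = {c: [] for c in col_count}
for df in df_lst:
    d[len(df.columns)].append(df)

group_sizes = {c: len(dfs) for c, dfs in d.items()}
print(same_object, group_sizes)
```

With this layout, d[8], d[10] and d[12] each hold only the DataFrames with that many columns.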
Answer 1 (score: 0)
I think this is enough for what you need. Think about how to solve the problem dynamically to make better use of Python.
In [2]: import pandas as pd
In [3]: for i in range(1, 6):
   ...:     exec(f"df{i} = pd.DataFrame(0, index=range({i}), columns=list('ABCD'))")  # build my own test DataFrames df1..df5 with 1 to 5 rows
   ...:
In [4]: df1  # one-row df
Out[4]:
   A  B  C  D
0  0  0  0  0
In [5]: df2  # two-row df
Out[5]:
   A  B  C  D
0  0  0  0  0
1  0  0  0  0
In [6]: df3  # three-row df
Out[6]:
   A  B  C  D
0  0  0  0  0
1  0  0  0  0
2  0  0  0  0
In [7]: L = [df1, df2, df3, df4, df5] #i assume all your dataframes are put into something like a container, which is the problem
In [13]: my_3_length_shape_dfs = [] #you need to create some sort of containers for your lengths (you can do an additional exec in the following In
In [14]: for i in L:
    ...:     if i.shape[0] == 3:  # add more of these if needed; you mentioned your lengths are known [8, 10, 12]
    ...:         my_3_length_shape_dfs.append(i)  # add the df to the chosen container, grouping all dfs whose row count equals 3
    ...:         print(i)
    ...:
   A  B  C  D
0  0  0  0  0
1  0  0  0  0
2  0  0  0  0
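The session above filters on i.shape[0], which is the row count, and uses one hand-written list per size. As a sketch of the more dynamic approach this answer hints at, the version below groups on shape[1] (the column count, which is what the question asks about) using collections.defaultdict, so the sizes do not have to be listed up front. group_by_column_count is a hypothetical helper name and the test frames are made up.

```python
from collections import defaultdict

import pandas as pd


def group_by_column_count(frames):
    """Return {column_count: [DataFrames with that many columns]}."""
    groups = defaultdict(list)  # a fresh list is created for each new key
    for frame in frames:
        groups[frame.shape[1]].append(frame)
    return dict(groups)


# Made-up test data, kept in a list instead of being created with exec.
frames = [
    pd.DataFrame(0, index=range(3), columns=range(n))
    for n in (8, 10, 12, 10)
]
groups = group_by_column_count(frames)
for n_cols, dfs in sorted(groups.items()):
    print(n_cols, len(dfs))
```

Swapping frame.shape[1] for frame.shape[0] would group by row count instead, matching the filter used in the session.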