Question

I have several pandas DataFrames in Python, and I want to loop over them to find which ones meet a row-based condition and save those in a new DataFrame.

d = {'Count' : ['10', '11', '12', '13','13.4','12.5']}
df_1= pd.DataFrame(data=d)
df_1

d = {'Count' : ['10', '-11', '-12', '13','16','2']}
df_2= pd.DataFrame(data=d)
df_2

This is the logic I want to use, but the syntax is not correct:

for df in (df_1,df_2)
    if df['Count'][0] >0 and df['Count'][1] >0 and df['Count'][2]>0 and df['Count'][3]>0 
    and (df['Count'][4] is between df['Count'][3]+0.5 and df['Count'][3]-0.5) is True:
        df.save

The correct output is df_1 ... , since it meets my condition. How can I create a new DataFrame or list to save the result?

Answer 1

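Let me know if you have any questions. The main updates I made to your code are:

Use .loc instead of chained indexing
Combine your first few separate "and" comparisons into a single comparison over a slice of the Series, and use .all() to reduce it to a single True/False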
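
As a small illustration of those two points, here is a sketch on a toy Series (the name counts and its values are made up for this example, not taken from your data). Note that .loc slices by label and includes both endpoints, so 0:3 covers four rows, not three.

```python
import pandas as pd

# Hypothetical toy data, used only to illustrate the two techniques
counts = pd.Series([10.0, 11.0, 12.0, 13.0, 13.4, 12.5], name="Count")

# .loc slicing is label-based and inclusive on both ends: rows 0, 1, 2 and 3
first_four = counts.loc[0:3]

# The comparison is element-wise and yields a boolean Series;
# .all() reduces that Series to a single True/False
all_positive = bool((first_four > 0).all())

# Scalars pulled out with .loc can be used in a normal chained comparison
within_half = counts.loc[3] - 0.5 <= counts.loc[4] <= counts.loc[3] + 0.5
```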

The full code:

import pandas as pd 

# df_1 & df_2 input taken from you
d = {'Count' : ['10', '11', '12', '13','13.4','12.5']}
df_1= pd.DataFrame(data=d)

d = {'Count' : ['10', '-11', '-12', '13','16','2']}
df_2= pd.DataFrame(data=d)

# my solution here
df_1['Count'] = df_1['Count'].astype('float')
df_2['Count'] = df_2['Count'].astype('float')

my_dataframes = {'df_1': df_1, 'df_2': df_2}
good_dataframes = []
for df_name, df in my_dataframes.items():
    if (df.loc[0:3, 'Count'] > 0).all() and (df.loc[3,'Count']-0.5 <= df.loc[4, 'Count'] <= df.loc[3, 'Count']+0.5):
        good_dataframes.append(df_name)

good_dataframes_df = pd.DataFrame({'good': good_dataframes})

Test:

>>> print(good_dataframes_df)
   good
0  df_1
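
If you would rather keep the matching DataFrames themselves than just their names, one possible variation is sketched below. The helper meets_criteria and the names candidates and matching are hypothetical, introduced only for this example. It also assumes that "within 0.5" can be written as an absolute difference, which may behave slightly differently from the two-sided comparison exactly at the boundary because of floating-point rounding.

```python
import pandas as pd

def meets_criteria(df: pd.DataFrame) -> bool:
    """Hypothetical helper: rows 0-3 are positive and row 4 is within 0.5 of row 3."""
    # The question stores the counts as strings, so convert them first
    counts = pd.to_numeric(df["Count"])
    first_four_positive = (counts.loc[0:3] > 0).all()
    near_previous = abs(counts.loc[4] - counts.loc[3]) <= 0.5
    return bool(first_four_positive and near_previous)

df_1 = pd.DataFrame({"Count": ["10", "11", "12", "13", "13.4", "12.5"]})
df_2 = pd.DataFrame({"Count": ["10", "-11", "-12", "13", "16", "2"]})

# Keep the DataFrames that pass, keyed by name
candidates = {"df_1": df_1, "df_2": df_2}
matching = {name: df for name, df in candidates.items() if meets_criteria(df)}
```

Here matching is a dict, so list(matching.values()) gives a plain list of the DataFrames that passed if that is the shape you need.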
