Conditional statements on a dataframe

Time: 2019-08-15 19:06:36

Tags: python-3.x dataframe if-statement conditional-statements

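I have a dataframe, shown below. (Image: Dataframe start)

I want to look at columns D, F, M and P and return a Result column containing the value that appears most often in each row.

The rules I want to follow are:
1) If a row is split evenly, with 2 IG and 2 HY, return HY in the Result column.
2) If a column contains a NaN value, ignore it and use the other available values.

I would like the resulting dataframe to look like this: (Image: Result_DF)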

import numpy as np
import pandas as pd

df_Start = pd.DataFrame({'P':['IG','HY','IG',np.nan,'HY'], 'M':['HY','HY','IG', np.nan,'IG'], 'F':['HY',np.nan,'HY', np.nan,'IG'],'D':['IG','IG','IG', 'HY','IG']})

df_end = pd.DataFrame({'Result':['HY','HY','IG', 'HY','IG'],'P':['IG','HY','IG',np.nan,'HY'], 'M':['HY','HY','IG', np.nan,'IG'], 'F':['HY',np.nan,'HY', np.nan,'IG'],'D':['IG','IG','IG', 'HY','IG']})




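def f(x):
    # Count the non-null values in the row
    frequencies = pd.Series(data=[y for y in x if pd.isnull(y)==False]).value_counts()
    a,b,c = 0,0,0
    if 'IG' in frequencies:
        b = frequencies['IG']
    if 'HY' in frequencies:
        a = frequencies['HY']
    if 'PFA' in frequencies:
        c = frequencies['PFA']
    # Any 'PFA' wins outright; otherwise ties between HY and IG go to 'HY'
    return 'PFA' if c > 0 else ('HY' if a >= b else 'IG')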

for i, row in new_df.iterrows():
    new_df.loc[i, 'result'] = f(row)

1 answer:

Answer 0: (score: 0)

Try this and let me know whether it works

def f(x):
    frequencies = pd.Series(data=[y for y in x if np.isnan(y)==False]).value_counts()

    a,b = frequencies['HY'],frequencies['IG']


    return 'HY' if a>=b else 'IG'
df['result'] = df.columns[['D','F','M','P']].apply(lambda x: f(x))


I can't work out right now why the approach above doesn't work
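For what it's worth, here is a small sketch of two likely culprits in that first attempt, assuming the cells hold strings and NaN as in the question. The `row` and `only_hy` names are made up for illustration. `np.isnan` only accepts numeric input, and indexing the result of `value_counts()` with a label that never occurred raises `KeyError`.

```python
import numpy as np
import pandas as pd

# Hypothetical row with the same kind of values as the question's data.
row = pd.Series({'D': 'IG', 'F': np.nan, 'M': 'HY', 'P': 'HY'})

# 1) np.isnan is a numeric ufunc, so a string such as 'IG' raises TypeError.
try:
    np.isnan('IG')
    isnan_error = None
except TypeError as exc:
    isnan_error = type(exc).__name__

# pd.notna copes with strings and NaN alike.
clean = [y for y in row if pd.notna(y)]

# 2) Looking up a label that never occurred in value_counts() raises KeyError.
only_hy = pd.Series(['HY']).value_counts()
try:
    only_hy['IG']
    lookup_error = None
except KeyError as exc:
    lookup_error = type(exc).__name__

# .get with a default returns 0 instead of raising.
ig_count = only_hy.get('IG', 0)

print(isnan_error, clean, lookup_error, ig_count)
```

Separately, `df.columns[['D','F','M','P']]` indexes the column labels rather than selecting the columns; `df[['D','F','M','P']].apply(f, axis=1)` is probably what was meant there.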


def f(x):

    frequencies = pd.Series(data=[y for y in x if pd.isnull(y)==False]).value_counts()
    a,b,c = 0,0,0
    if 'IG' in frequencies:
        b = frequencies['IG']
    if 'HY' in frequencies:
        a = frequencies['HY']
    if 'PFA' in frequencies:
        c = frequencies['PFA']
    if c>=1:
        return 'PFA'
    else:
        return 'HY' if a>=b else 'IG'


for i,row in df_Start.iterrows():
    df_Start.loc[i,'result'] = f(row)

The new version should work.
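If the loop turns out to be slow on a larger frame, a vectorized variant may be worth trying. This is only a sketch, assuming the only labels are 'HY' and 'IG' (it drops the 'PFA' branch) and using a `cols` list introduced here for illustration. It counts each label per row and lets `>=` send ties to 'HY'.

```python
import numpy as np
import pandas as pd

df_Start = pd.DataFrame({
    'P': ['IG', 'HY', 'IG', np.nan, 'HY'],
    'M': ['HY', 'HY', 'IG', np.nan, 'IG'],
    'F': ['HY', np.nan, 'HY', np.nan, 'IG'],
    'D': ['IG', 'IG', 'IG', 'HY', 'IG'],
})
cols = ['D', 'F', 'M', 'P']

# Comparing NaN with a string gives False, so missing values never add to a count.
hy = (df_Start[cols] == 'HY').sum(axis=1)
ig = (df_Start[cols] == 'IG').sum(axis=1)

# >= means a tie (including 2 HY vs 2 IG) resolves to 'HY'.
df_Start['Result'] = np.where(hy >= ig, 'HY', 'IG')

print(df_Start['Result'].tolist())  # ['HY', 'HY', 'IG', 'HY', 'IG']
```

Like the loop version, a row that is entirely NaN would come out as 'HY' here, since both counts are zero.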