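Problem statement
I want to perform a str.contains check between two columns of a DataFrame, comparing in lower case and depending on the values of other columns. The new column 2_Match should follow these conditions:
1. 1_Match or 1_1_Match is Yes or No; if it is No, 2_Match becomes Not Applicable.
2. Check whether Country (e.g. EU) is in / contained in Nation (e.g. Europe). If it is, 2_Match becomes Yes.
3. If there is no match, or only a partial match, between Country (e.g. APAC) and Nation (e.g. INDIA), 2_Match becomes No.
DF1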
Country  Nation   1_Match  1_1_Match
EU       Europe   Yes      No
MA       MACOPEC  No       No
APAC     INDIA    Yes      No
COPEC    MACOPEC  No       Yes
COPEC    India    No       Yes
Expected output:
DF1
Country  Nation   1_Match  1_1_Match  2_Match
EU       Europe   Yes      No         Yes
MA       MACOPEC  No       No         Not Applicable
APAC     INDIA    Yes      No         No
COPEC    MACOPEC  No       Yes        Yes
Copec    India    No       Yes        No
Code (not working): I am writing the code for conditions 2 and 3 first, but it throws an error; I then also want to accommodate condition 1.
df1['2_Match'] = np.where(df1['Country'].str.strip().str.lower().str.contains(df1['Nation'].str.strip().str.lower().astype(str)),'Yes','No')
Answer 0 (score: 1)
Use numpy.select together with a list comprehension that uses in to test, row by row, whether Country is a substring of Nation:
m1 = df['1_Match'] == 'No'
m2 = [c.lower() in n.lower() for c, n in zip(df['Country'], df['Nation'])]
masks = [m1, m2]
vals = ['Not Applicable','Yes']
df['2_Match'] = np.select(masks, vals, default='No')
print (df)
  Country   Nation 1_Match         2_Match
0      EU   Europe     Yes             Yes
1      MA  MACOPEC      No  Not Applicable
2    APAC    INDIA     Yes              No
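The order of the masks matters here: for each row, np.select takes the value paired with the first condition that is True and falls back to default when none is. A minimal sketch of that behaviour, using made-up toy masks (the names first and second are only for illustration, they are not the question's data):

```python
import numpy as np

# Toy masks, invented for illustration only.
first = np.array([True, True, False, False])
second = np.array([True, False, True, False])

# At position 0 both masks are True, so the value paired with the
# earlier mask ('Not Applicable') is used there.
result = np.select([first, second], ['Not Applicable', 'Yes'], default='No')
print(result.tolist())
```

This is why m1 is listed before m2 above: a row with 1_Match equal to No gets Not Applicable even if Country happens to be contained in Nation.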
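EDIT: To also take 1_1_Match into account, add a third mask and put it first, so that rows with 1_1_Match equal to Yes always get Yes: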
m1 = df['1_Match'] == 'No'
m2 = [c.lower() in n.lower() for c, n in zip(df['Country'], df['Nation'])]
m3 = df['1_1_Match'] == 'Yes'
masks = [m3, m1, m2]
vals = ['Yes', 'Not Applicable','Yes']
df['2_Match'] = np.select(masks, vals, default='No')
print (df)
  Country   Nation 1_Match 1_1_Match         2_Match
0      EU   Europe     Yes        No             Yes
1      MA  MACOPEC      No        No  Not Applicable
2    APAC    INDIA     Yes        No              No
3   COPEC  MACOPEC      No       Yes             Yes
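EDIT 2: If rows with 1_1_Match equal to Yes should still go through the substring check, exclude them only from the Not Applicable mask:
# m1, m2 and m3 are the masks defined above
masks = [m1 & ~m3, m2]
vals = ['Not Applicable','Yes']
df['2_Match'] = np.select(masks, vals, default='No')
print (df)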
  Country   Nation 1_Match 1_1_Match         2_Match
0      EU   Europe     Yes        No             Yes
1      MA  MACOPEC      No        No  Not Applicable
2    APAC    INDIA     Yes        No              No
3   COPEC  MACOPEC      No       Yes             Yes
4   COPEC    India      No       Yes              No
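For reference, the last variant can be put together as one self-contained script. This is only a sketch: the sample data is copied from the question, the helper name add_second_match is hypothetical, and it assumes Country and Nation contain no missing values. The str.strip() calls mirror what the question's own attempt did.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    'Country': ['EU', 'MA', 'APAC', 'COPEC', 'COPEC'],
    'Nation': ['Europe', 'MACOPEC', 'INDIA', 'MACOPEC', 'India'],
    '1_Match': ['Yes', 'No', 'Yes', 'No', 'No'],
    '1_1_Match': ['No', 'No', 'No', 'Yes', 'Yes'],
})


def add_second_match(frame):
    """Hypothetical helper: return a copy of frame with a 2_Match column."""
    out = frame.copy()
    # Normalise both text columns once (assumes no missing values).
    country = out['Country'].str.strip().str.lower()
    nation = out['Nation'].str.strip().str.lower()
    m1 = out['1_Match'] == 'No'
    m3 = out['1_1_Match'] == 'Yes'
    # Row-wise substring test: is Country contained in Nation?
    m2 = np.array([c in n for c, n in zip(country, nation)], dtype=bool)
    # The first True mask wins; rows matching neither get the default.
    out['2_Match'] = np.select([m1 & ~m3, m2],
                               ['Not Applicable', 'Yes'],
                               default='No')
    return out


result = add_second_match(df)
print(result)
```

Working on a copy leaves the original df untouched, which makes it easier to compare the different mask orderings side by side.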