Question

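Problem statement

I want to perform a case-insensitive str.contains check between two columns of a DataFrame, conditional on the values in other columns.

1. First, check whether 1_Match or 1_1_Match is Yes or No; if neither is Yes, 2_Match becomes Not Applicable.
2. If 1_Match is Yes or 1_1_Match is Yes, check whether Country (EU) is in / contained in Nation (Europe). If so, 2_Match becomes Yes.
3. If it is not contained, or there is only a partial mismatch, as between Country (APAC) and Nation (INDIA), 2_Match becomes No.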

DF1

Country          Nation         1_Match   1_1_Match
EU               Europe         Yes       No
MA               MACOPEC        No        No
APAC             INDIA          Yes       No
COPEC            MACOPEC        No        Yes
COPEC            India          No        Yes

Expected output:

DF1

Country       Nation           1_Match       1_1_Match   2_Match
EU            Europe             Yes           No        Yes
MA            MACOPEC            No            No        Not Applicable
APAC          INDIA              Yes           No        No
COPEC         MACOPEC            No            Yes       Yes
Copec         India              No            Yes       No

Code (not working): I am writing code for conditions 2 and 3, but it throws an error; I would then also like to accommodate condition 1.

df1['2_Match']  = np.where(df1['Country'].str.strip().str.lower().str.contains(df1['Nation'].str.strip().str.lower().astype(str)),'Yes','No')

Answer 1

Use numpy.select with a list comprehension and the in operator to test for substrings between the columns:

m1 = df['1_Match'] == 'No'
m2 = [c.lower() in n.lower() for c, n in zip(df['Country'], df['Nation'])]
masks = [m1, m2]
vals = ['Not Applicable','Yes']

df['2_Match'] = np.select(masks, vals, default='No')
print (df)
  Country   Nation 1_Match         2_Match
0      EU   Europe     Yes             Yes
1      MA  MACOPEC      No  Not Applicable
2    APAC    INDIA     Yes              No
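
For context, Series.str.contains expects a single pattern rather than another Series, which is presumably why the attempt in the question raises an error; the substring test has to be done row by row instead. A minimal self-contained sketch of just that row-wise check, assuming the sample data from the question. The helper name contains_ci is hypothetical, and the strip() calls mirror the question's attempt rather than the answer's code:

```python
import pandas as pd

# Sample data copied from the question (only the two compared columns).
df = pd.DataFrame({
    'Country': ['EU', 'MA', 'APAC', 'COPEC', 'COPEC'],
    'Nation': ['Europe', 'MACOPEC', 'INDIA', 'MACOPEC', 'India'],
})

def contains_ci(country, nation):
    # Hypothetical helper: case-insensitive "country is a substring of nation",
    # ignoring surrounding whitespace.
    return country.strip().lower() in nation.strip().lower()

# Pair the two columns row by row and test each pair.
m2 = [contains_ci(c, n) for c, n in zip(df['Country'], df['Nation'])]
print(m2)  # [True, True, False, True, False]
```

Note that row 1 (MA / MACOPEC) matches as a substring; it only ends up as Not Applicable because the 1_Match mask is listed first in numpy.select.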

Edit:

m1 = df['1_Match'] == 'No'
m2 = [c.lower() in n.lower() for c, n in zip(df['Country'], df['Nation'])]

m3 = df['1_1_Match'] == 'Yes'

masks = [m3, m1, m2]
vals = ['Yes', 'Not Applicable','Yes']

df['2_Match'] = np.select(masks, vals, default='No')
print (df)
  Country   Nation 1_Match 1_1_Match         2_Match
0      EU   Europe     Yes        No             Yes
1      MA  MACOPEC      No        No  Not Applicable
2    APAC    INDIA     Yes        No              No
3   COPEC  MACOPEC      No       Yes             Yes

Edit 2:

masks = [m1 & ~m3, m2]
vals = ['Not Applicable','Yes']

df['2_Match'] = np.select(masks, vals, default='No')
print (df)
  Country   Nation 1_Match 1_1_Match         2_Match
0      EU   Europe     Yes        No             Yes
1      MA  MACOPEC      No        No  Not Applicable
2    APAC    INDIA     Yes        No              No
3   COPEC  MACOPEC      No       Yes             Yes
4   COPEC    India      No       Yes              No
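
Putting the pieces of Edit 2 together, a self-contained sketch might look like the following. It assumes the five-row sample from the question and rebuilds the masks m1, m2 and m3 from scratch; the variable names follow the answer, and the DataFrame construction is added here only so the snippet runs on its own:

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    'Country': ['EU', 'MA', 'APAC', 'COPEC', 'COPEC'],
    'Nation': ['Europe', 'MACOPEC', 'INDIA', 'MACOPEC', 'India'],
    '1_Match': ['Yes', 'No', 'Yes', 'No', 'No'],
    '1_1_Match': ['No', 'No', 'No', 'Yes', 'Yes'],
})

m1 = df['1_Match'] == 'No'        # first flag is No
m3 = df['1_1_Match'] == 'Yes'     # second flag is Yes
# Row-wise, case-insensitive substring test: is Country contained in Nation?
m2 = np.array([c.lower() in n.lower()
               for c, n in zip(df['Country'], df['Nation'])])

# numpy.select takes the first matching condition per row:
# neither flag is Yes -> 'Not Applicable'; otherwise substring match -> 'Yes';
# anything else falls through to the default 'No'.
masks = [m1 & ~m3, m2]
vals = ['Not Applicable', 'Yes']
df['2_Match'] = np.select(masks, vals, default='No')

print(df['2_Match'].tolist())
# ['Yes', 'Not Applicable', 'No', 'Yes', 'No']
```

The order of the masks matters: because m1 & ~m3 comes first, a row where both flags are negative is marked Not Applicable even when the substring test would succeed.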
