Replace values using string matching in Python?

Time: 2020-07-23 06:59:57

Tags: python python-3.x pandas python-2.7

df:

Item_Key   Item_Name   Delivery_Partner    Ex_Delivery_Partner
10001      Brush       zen                 zen and company
10002      sandle      randv               randv ltd
10003      chemical    ABC                 zen and company
10004      chair       hank                hank solution
10005      shoes       zen                 tour and zen ltd
10006      shoes       XYZ                 tour and zen ltd
10007      ropes       delv.com            delv.com
10008      ropes       hans ltd            delv.com and Enterprises

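I need to replace the name in "Delivery_Partner" by checking whether it appears in "Ex_Delivery_Partner". If there is no matching string, it should be replaced with the value from "Ex_Delivery_Partner".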
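The rule described above can be sketched for a single pair of values in plain Python. The helper name resolve_partner is hypothetical and only illustrates the intended behavior. It assumes "matching" means a case-sensitive substring test, which is how the in operator works on strings.

```python
def resolve_partner(partner, ex_partner):
    """Return partner if it occurs inside ex_partner, otherwise ex_partner."""
    # `in` on two strings is a case-sensitive substring test
    return partner if partner in ex_partner else ex_partner


# "zen" occurs inside the longer name, so it is kept
print(resolve_partner("zen", "tour and zen ltd"))
# "ABC" does not occur, so the Ex_Delivery_Partner value is used
print(resolve_partner("ABC", "zen and company"))
```

Because this is a substring test and not a whole-word match, a short name would also match inside a longer word.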

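1 answer:

Answer 0: (score: 2)

Test membership between the two columns with the in operator, either in DataFrame.apply or in a list comprehension, then set the new values with Series.where or DataFrame.loc: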

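# True where Delivery_Partner is a substring of Ex_Delivery_Partner in the same row
mask = df.apply(lambda x: x['Delivery_Partner'] in x['Ex_Delivery_Partner'], axis=1)

# Keep matching values; otherwise take the value from Ex_Delivery_Partner
df['Delivery_Partner'] = df['Delivery_Partner'].where(mask, df['Ex_Delivery_Partner'])
print(df)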
   Item_Key Item_Name  Delivery_Partner Ex_Delivery_Partner
0     10001     Brush               zen     zen and company
1     10002    sandle             randv           randv ltd
2     10003  chemical   zen and company     zen and company
3     10004     chair              hank       hank solution
4     10005     shoes               zen    tour and zen ltd
5     10006     shoes  tour and zen ltd    tour and zen ltd
6     10007     ropes          delv.com            delv.com

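# Alternative: list comprehension, True where Delivery_Partner is NOT a substring
mask = [x not in y for x, y in df[['Delivery_Partner', 'Ex_Delivery_Partner']].to_numpy()]

# Overwrite only the non-matching rows
df.loc[mask, 'Delivery_Partner'] = df['Ex_Delivery_Partner']
print(df)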
   Item_Key Item_Name  Delivery_Partner Ex_Delivery_Partner
0     10001     Brush               zen     zen and company
1     10002    sandle             randv           randv ltd
2     10003  chemical   zen and company     zen and company
3     10004     chair              hank       hank solution
4     10005     shoes               zen    tour and zen ltd
5     10006     shoes  tour and zen ltd    tour and zen ltd
6     10007     ropes          delv.com            delv.com
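For anyone who wants to run the approach end to end, here is a minimal self-contained sketch. It rebuilds the sample data from the question by hand (all eight rows) and, as a small variation on the answer, builds the mask with zip over the two columns before passing it to Series.where. It assumes both columns contain only strings; missing values would need separate handling.

```python
import pandas as pd

# Sample data copied by hand from the question
df = pd.DataFrame({
    "Item_Key": [10001, 10002, 10003, 10004, 10005, 10006, 10007, 10008],
    "Item_Name": ["Brush", "sandle", "chemical", "chair",
                  "shoes", "shoes", "ropes", "ropes"],
    "Delivery_Partner": ["zen", "randv", "ABC", "hank",
                         "zen", "XYZ", "delv.com", "hans ltd"],
    "Ex_Delivery_Partner": ["zen and company", "randv ltd", "zen and company",
                            "hank solution", "tour and zen ltd",
                            "tour and zen ltd", "delv.com",
                            "delv.com and Enterprises"],
})

# True where Delivery_Partner occurs as a substring of Ex_Delivery_Partner
mask = pd.Series(
    [p in ex for p, ex in zip(df["Delivery_Partner"], df["Ex_Delivery_Partner"])],
    index=df.index,
)

# Keep matching values; otherwise take the value from Ex_Delivery_Partner
df["Delivery_Partner"] = df["Delivery_Partner"].where(mask, df["Ex_Delivery_Partner"])

print(df["Delivery_Partner"].tolist())
```

With this data, rows 10003, 10006 and 10008 have no match and therefore take their Ex_Delivery_Partner value; the other rows are left unchanged.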