通过上面的代码进行分析：

Question

我有一个excel-1（原始数据）和excel-2（参考文档）

在excel-1中，“ Comments”应与excel-2“ Comments”列相匹配。如果excel-1“ comments”列中的字符串包含excel-2“ comments”列中的任何子字符串，则Primary excel-1中的每一行都应填充excel-2的原因和次要原因。

Excel-1 {'Item'：{0：'rr-1'，1：'ss-2'}，'Order'：{0：1，1：2}，'Comments'：{0：'Good;缺货订单，＃1237-MF，结束库存'，1：'没有变化，不好，下周发货，09/12 / 2018-MF *'}}

Excel-2 {“评论”：{0：“良好”，1：“库存无序”，2：“库存已关闭”，3：“无变化”，4：“库存不足”，5：“下周交货”} ，“主要原因”：{0：“质量”，1：“仓库”，2：“物流”，3：“反馈”，4：“仓库”，5：“物流”}，“次要原因”：{ 0：“制造”，1：“库存”，2：“仓库”，3：“反馈”，4：“库存”，5：“仓库”}}

请帮助建立逻辑。

当使用pd.dataframe.str.contains / isin函数进行单个匹配时，我得到了答案，但是如何编写逻辑来搜索多个匹配并以特定的结构格式编写。

for value in df['Comments']:
    string = re.sub(r'[?|$|.|!|,|;]',r'',value)
    for index,value in df1.iterrows():
        substring = df1.Comment[index]
        if substring in string:
            df['Primary Reason']= df1['Primary Reason'][index]
            df['Secondary Reason']=df1['Secondary Reason'][index]

Answer 1

获取df ['Comments']中的值：

string = re.sub(r'[?|$|.|!|,|;]',r'',value)

for index,value in df1.iterrows():

    substring = df1.Comment[index]

    if substring in string:

        df['Primary Reason']= df1['Primary Reason'][index]

        df['Secondary Reason']=df1['Secondary Reason'][index]

通过上面的代码进行分析：

基本上，您正在比较excel-1的row1和excel-2的row-1，并匹配子字符串和字符串，并使“主要和次要原因”正确吗？
在这里，您覆盖相同的位置，即o / p位置，因此，您最终只能得到1个结果。

问题在以下代码中：

df['Primary Reason']= df1['Primary Reason'][index]

df['Secondary Reason']=df1['Secondary Reason'][index]

使用逻辑，您可以在与以下格式相同的行中累加结果

res1，res2 .... etc

来自不同列的Python Pandas字符串匹配

1 个答案:

通过上面的代码进行分析：