I wrote some code to match my TV names. I took a single row from df, one that should match but does not, to check what is wrong with my code:
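import pandas as pd

Data = {'name': ['LG 43UJ634V'],
        'comp_name': ['LG 43UJ634V'],
        'manufacturer': ['LG'],
        'comp_manufacturer': [''],
        'category': ['TVs']
        }
# 'category' is listed in columns so that the df.category filter used further down can find it
df = pd.DataFrame(Data, columns=['name', 'comp_name', 'manufacturer', 'comp_manufacturer', 'category'])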
My code for matching these items is here:
our_name = df['name'].iloc[0].lower()
comp_name = df['comp_name'].iloc[0].lower()
brand = df['manufacturer'].iloc[0].lower()
comp_brand = df['comp_manufacturer'].iloc[0].lower()
print('Our name:', our_name)
print('Comp name:', comp_name)
print('Brand:', brand)
print('Comp_brand:', comp_brand)
our_name = our_name.replace(brand, '').strip()
our_name = our_name.replace(comp_brand, '').strip()
print('Our name after brand removal:', our_name)
splitOurName = our_name.split(' ')
print('Our name split:', splitOurName)
counter = 0
for j in splitOurName:
    if j in comp_name:
        counter = counter + 1
print('counter:', counter)
if counter == len(splitOurName):
    if ((len(our_name.split(' ')) == 1 and our_name.isalpha()) or
            (len(comp_name.split(' ')) == 1 and comp_name.isalpha()) or
            len(our_name) <= 4):
        print('No match')
    else:
        print('Perfect match')
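The idea is that I only need to check the unique combination of letters and digits in our name and nothing else (no symbols, no other words, no brand, etc.). If that combination appears in the competitor's name, I can call it a match. I check this with a counter that verifies whether every string remaining in our name is found in the competitor's name (in this case it is just one string, but my original dataframe has many names that still consist of several strings after all the corrections). If so, it is a match. Accordingly, the current code prints "Perfect match". However, if I replace the last two prints with "return True" and "return False" respectively and call the code as a function on my test dataframe, I get NaN (for exactly the same row). What am I missing here?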
Update:
I updated the dataframe, and this is how I call the function on the test dataframe:
df.loc[df.category.isin(['TVs']), 'match'] = df.loc[df.category.isin(['TVs'])].apply(tv_match, axis=1)
Expected result: df['match'] == True
Actual result: df['match'] == NaN
Answer 0 (score: 0)
The following code works fine for me. Are you sure you are calling it correctly?
import pandas as pd

Data = {'name': ['LG 43UJ634V'],
        'comp_name': ['LG 43UJ634V'],
        'manufacturer': ['LG'],
        'comp_manufacturer': ['']
        }
df = pd.DataFrame(Data, columns=['name', 'comp_name', 'manufacturer', 'comp_manufacturer'])

def sample():
    our_name = df['name'].iloc[0].lower()
    comp_name = df['comp_name'].iloc[0].lower()
    brand = df['manufacturer'].iloc[0].lower()
    comp_brand = df['comp_manufacturer'].iloc[0].lower()
    print('Our name:', our_name)
    print('Comp name:', comp_name)
    print('Brand:', brand)
    print('Comp_brand:', comp_brand)
    # Remove both brand names so that only the model string is compared
    our_name = our_name.replace(brand, '').strip()
    our_name = our_name.replace(comp_brand, '').strip()
    print('Our name after brand removal:', our_name)
    splitOurName = our_name.split(' ')
    print('Our name split:', splitOurName)
    # Count how many of our tokens occur in the competitor's name
    counter = 0
    for j in splitOurName:
        if j in comp_name:
            counter = counter + 1
    print('counter:', counter)
    if counter == len(splitOurName):
        if ((len(our_name.split(' ')) == 1 and our_name.isalpha()) or
                (len(comp_name.split(' ')) == 1 and comp_name.isalpha()) or
                len(our_name) <= 4):
            return False  # the 'No match' branch of the question
        else:
            return True  # the 'Perfect match' branch of the question
    return False  # not every token was found; avoids an implicit None

print(sample())
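Since tv_match itself is not shown in the question, the following is only a sketch of how a row-wise version might look, assuming it takes the row that apply(..., axis=1) passes in and should return a bool on every path. A Python function that falls through without reaching a return statement returns None, which is one possible way a missing value ends up in the column; whether that is what happens in the original code is a guess. The helper name too_generic is made up for illustration.

```python
import pandas as pd

def tv_match(row):
    """Hypothetical row-wise matcher; returns True or False on every path."""
    our_name = row['name'].lower()
    comp_name = row['comp_name'].lower()
    brand = row['manufacturer'].lower()
    comp_brand = row['comp_manufacturer'].lower()

    # Strip both brand names from our name; empty brands are skipped
    for b in (brand, comp_brand):
        if b:
            our_name = our_name.replace(b, '').strip()

    tokens = our_name.split()
    # Every remaining token must occur in the competitor's name
    if not tokens or not all(tok in comp_name for tok in tokens):
        return False

    # Same exclusions as the 'No match' branch in the question
    too_generic = (
        (len(tokens) == 1 and our_name.isalpha())
        or (len(comp_name.split()) == 1 and comp_name.isalpha())
        or len(our_name) <= 4
    )
    return not too_generic

data = {'name': ['LG 43UJ634V'],
        'comp_name': ['LG 43UJ634V'],
        'manufacturer': ['LG'],
        'comp_manufacturer': [''],
        'category': ['TVs']}
df = pd.DataFrame(data)

# Same calling pattern as in the question's update
mask = df['category'].isin(['TVs'])
df.loc[mask, 'match'] = df.loc[mask].apply(tv_match, axis=1)
print(df['match'])
```

Reading the values from the row argument rather than from df[...].iloc[0] means each row is compared against its own competitor name, which matters once the dataframe has more than one row.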