我正在尝试使用过滤器查找关键字之间的匹配百分比,并且在使用循环时难以获得正确的百分比结果。
这是我到目前为止尝试过的:
import pandas as pd
def percentmatch(component=[], manufacture=[]):
dummy = 0
for i in component:
if i in manufacture:
dummy += 1
requirements = len(component)
return (dummy/requirements)*100
def isDesired(innovator = [], manufacture = []):
for i in innovator:
if i in manufacture:
return True
return False
part = pd.read_csv("fakedata.csv")
#Change the Value for test case
part['Size'].iloc[5] = 'Startup'
manufacture = pd.read_csv("book1.csv")
#First filter if the manufacture wants to work with certain customer
criteria = []
for i, r in manufacture.iterrows():
criteria.append((isDesired([part['Size'].iloc[0]], r['Desired Customer**'].split(", "))))
manufacture['criteria'] = criteria
firstfilter = manufacture[criteria]
现在是第二个过滤器。
#Second filter if the manufacture can do certain phase. Ex: prototype, pre-release
criteria2 = []
for i, r in firstfilter.iterrows():
criteria2.append(isDesired([part['Phase'].iloc[0]], r['Preferred'].split(", ")))
firstfilter['criteria2'] = criteria2
secondfilter = firstfilter[criteria2]
#Third Filter to find the percent match in Methods
percentmatch1 = []
for i, r in secondfilter.iterrows():
print(r['Method'].split(", "))
print(part['Method'].iloc[0].split(", "))
# Indentation below is there, but refuses to show in S.O. for some reason
percentmatch1.append(percentmatch([part['Method'].iloc[0].split(", ")], r['Method'].split(",")))
# End of for loop is above, next line is on same level of indentation as for loop instantiation
secondfilter['Method match'] = percentmatch1
在上面的代码块中,我的输出是
['CNC Machining', '3D printing', 'Injection Molding']
['CNC Machining', '3D printing']
快速执行secondfilter.head()查找会给我以下内容:
secondfilter.head() output here
方法匹配应为100%,而不是0%。我该如何纠正?