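The goal is to check product availability in a product catalog by using product titles.
Input:
1. I receive product titles from sellers; each title is split on whitespace into multiple items, as in Array1.
2. I have my own product catalog in a database; each product title is likewise split on whitespace into multiple items, as in Array2.
Output:
To answer whether each seller product title exists in my catalog, like this: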
Product 1 from seller -> product A from catalog -> 85% match
Product 2 from seller -> product B from catalog -> 25% match
Product 3 from seller -> product C from catalog -> 45% match
...
Product n from seller -> product D from catalog -> 73% match
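Thought process and match percentage calculation:
1. Loop each item of each product in Array1 through every array (product) in Array2.
2. If the item from Array1 is present in the array from Array2, assign value = 1; otherwise assign value = 0.
3. Ignore NaN values during this process.
4. A = the number of values equal to 1 for each array in Array1.
5. B = the number of items in each product (not counting NaN).
6. Match percentage for each product in Array1 = A / B.
7. If a single product gets the same match percentage against several catalog products, take the match percentage from the first loop.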
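Steps 2 to 6 for a single seller/catalog pair can be sketched roughly as below. This is only a minimal illustration, not a definitive solution: the function name `match_percentage` and the `missing` parameter are hypothetical, it assumes the padding value is the literal string 'NaN' (as in the arrays shown), and it assumes exact, case-sensitive comparison of items.

```python
def match_percentage(seller_tokens, catalog_tokens, missing="NaN"):
    """Share of the seller's non-missing items that appear in the catalog title."""
    # Step 3: ignore the 'NaN' padding (assumed to be the string 'NaN').
    seller = [t for t in seller_tokens if t != missing]
    catalog = {t for t in catalog_tokens if t != missing}
    if not seller:
        return 0.0  # nothing to compare, avoid dividing by zero
    # Step 2: 1 if the item is present in the catalog title, else 0.
    flags = [1 if t in catalog else 0 for t in seller]
    # Steps 4-6: A = number of ones, B = number of items, result = A / B.
    return sum(flags) / len(seller)


print(match_percentage(["Black", "Pen", "NaN"], ["Black", "Book", "Big"]))  # 0.5
```

Because the comparison is exact, 'Seachem' and 'SEACHEM' would count as different items; lowercasing both sides first would be one way to change that, if it is wanted.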
Example:
Suppose I have 2 products from the seller, array1_example, sliced from Array1, and my product catalog, array2_example, sliced from Array2:
array1_example = array([['Black', 'Pen', 'NaN'],
                        ['Yellow', 'Pen', 'New']], dtype=object)
array2_example = array([['Black', 'Book', 'Big'],
                        ['Yellow', 'Pen', 'Small'],
                        ['White', 'notebook', 'Medium']], dtype=object)
Output_looping_1 [1, 0],    # there is 'Black' and no 'Pen' in ['Black', 'Book', 'Big'] # 1/2=50%
                 [0, 1],    # there is no 'Black' and there is 'Pen' in ['Yellow', 'Pen', 'Small'] # 1/2=50%
                 [0, 0]     # there is no 'Black' and no 'Pen' in ['White', 'notebook', 'Medium'] # 0/2=0%
Output_looping_2 [0, 0, 0], # check ['Yellow', 'Pen', 'New'] in ['Black', 'Book', 'Big'] # 0/3=0%
                 [1, 1, 0], # check ['Yellow', 'Pen', 'New'] in ['Yellow', 'Pen', 'Small'] # 2/3=67%
                 [0, 0, 0]  # check ['Yellow', 'Pen', 'New'] in ['White', 'notebook', 'Medium'] # 0/3=0%
Output_final ['Black', 'Pen', 'NaN'] -> ['Black', 'Book', 'Big'] -> 50% match
             ['Yellow', 'Pen', 'New'] -> ['Yellow', 'Pen', 'Small'] -> 67% match
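For reference, the example above can be reproduced with a plain nested loop along the following lines. This is a sketch under assumptions rather than a tuned solution: `best_matches` is a hypothetical helper, 'NaN' is treated as a literal padding string, comparison is exact and case-sensitive, and the catalog is assumed to be non-empty. Plain lists are used so the snippet runs without NumPy; rows of a NumPy object array can be iterated the same way.

```python
array1_example = [["Black", "Pen", "NaN"],
                  ["Yellow", "Pen", "New"]]
array2_example = [["Black", "Book", "Big"],
                  ["Yellow", "Pen", "Small"],
                  ["White", "notebook", "Medium"]]


def best_matches(sellers, catalog, missing="NaN"):
    """For each seller title, return (seller, best catalog title, score)."""
    results = []
    for seller_row in sellers:
        # Ignore the 'NaN' padding in the seller title.
        tokens = [t for t in seller_row if t != missing]
        scores = []
        for catalog_row in catalog:
            catalog_set = {t for t in catalog_row if t != missing}
            hits = sum(1 for t in tokens if t in catalog_set)  # A
            scores.append(hits / len(tokens) if tokens else 0.0)  # A / B
        # max() keeps the first maximal element, so ties go to the
        # earliest catalog row, matching the "first loop" rule.
        best = max(range(len(scores)), key=scores.__getitem__)
        results.append((list(seller_row), list(catalog[best]), scores[best]))
    return results


for seller, match, score in best_matches(array1_example, array2_example):
    print(f"{seller} -> {match} -> {score:.0%} match")
# ['Black', 'Pen', 'NaN'] -> ['Black', 'Book', 'Big'] -> 50% match
# ['Yellow', 'Pen', 'New'] -> ['Yellow', 'Pen', 'Small'] -> 67% match
```

Every seller title is compared with every catalog title, so the cost grows with the product of the two array sizes; for the full Array1 and Array2 that may be slow, and an index from item to catalog rows might be worth considering.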
Array1:
array([['Mai', 'Dubai', '200ml', ..., 'NaN', 'NaN', 'NaN'],
['Mai', 'Dubai', 'Cup', ..., 'NaN', 'NaN', 'NaN'],
['Mai', 'Dubai', '1.5', ..., 'NaN', 'NaN', 'NaN'],
...,
['Seachem', 'Aquavitro', 'Alpha', ..., 'NaN', 'NaN', 'NaN'],
['SEACHEM', 'AQUAVITRO', 'VIBRANCE', ..., 'NaN', 'NaN', 'NaN'],
['SEACHEM', 'AQUAVITRO', 'CALCIFICATI', ..., 'NaN', 'NaN', 'NaN']],
dtype=object)
Array2:
array([['2-Piece', 'Glitzi', 'Power', ..., 'NaN', 'NaN', 'NaN'],
['15-Piece', 'Bones', 'For', ..., 'NaN', 'NaN', 'NaN'],
['Sliced', 'Beets', '425', ..., 'NaN', 'NaN', 'NaN'],
...,
['Cookies', 'With', 'Hazelnut', ..., 'NaN', 'NaN', 'NaN'],
['Apple', 'Kale', 'Avocado', ..., 'NaN', 'NaN', 'NaN'],
['Selection', 'Of', 'Six', ..., 'NaN', 'NaN', 'NaN']], dtype=object)
Please give me some guidance on how to solve this problem. Thanks!