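Search for products by product title

Time: 2019-12-26 10:56:29

Tags: python arrays pandas

The goal is to check whether a seller's products are available in my product catalog by comparing product titles.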

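Input:
1. I receive product titles from a seller, and each title is split on whitespace into multiple items, as in Array1.
2. I have my own product catalog in a database, and each product title is split on whitespace into multiple items, as in Array2.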
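
For reference, a minimal sketch of how such arrays could be built from raw titles, assuming the padding is the literal string 'NaN' as in the arrays shown further down. The function name `titles_to_rows` and the sample titles are hypothetical and only serve as illustration:

```python
def titles_to_rows(titles, pad="NaN"):
    """Split each title on whitespace and pad every row to the same width."""
    split = [title.split() for title in titles]
    # Width of the longest title; 0 if there are no titles at all.
    width = max((len(tokens) for tokens in split), default=0)
    return [tokens + [pad] * (width - len(tokens)) for tokens in split]


rows = titles_to_rows(["Black Pen", "Yellow Pen New"])
print(rows)
```

Each row has the same length, so the result can be passed to `numpy.array(rows, dtype=object)` if an array is preferred over nested lists.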

Output:
For each of the seller's product titles, report whether it exists in my catalog, like this:

Product 1 from seller -> product A from catalog         -> 85% match 
Product 2 from seller -> product B from catalog         -> 25% match
Product 3 from seller -> product C from catalog         -> 45% match
       ...
Product n from seller -> product D from catalog         -> 73% match


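Thought process and match-percentage calculation:
1. Loop each item of each product in Array1 over every array (product) in Array2.
2. If an item from Array1 is present in the Array2 array, assign value = 1; otherwise assign value = 0.
3. Ignore NaN values in this process.
4. A = the number of items with value = 1 for each product in Array1.
5. B = the number of items in each product.
6. Match percentage for each product in Array1 = A / B.
7. Keep the best-matching catalog product for each seller product. If several catalog products give the same match percentage, take the one from the first loop.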
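
The steps above can be sketched in plain Python as follows. This is one possible reading of the procedure, not a definitive implementation: it assumes the padding is the literal string 'NaN', that comparison is exact and case-sensitive, and that "take the first loop" means the earliest catalog row wins a tie. The names `match_fraction` and `best_match` are hypothetical.

```python
PAD = "NaN"  # assumption: padding is the literal string 'NaN'


def match_fraction(seller_tokens, catalog_tokens):
    """Share of the seller's non-padding tokens found in the catalog title."""
    tokens = [t for t in seller_tokens if t != PAD]    # step 3: ignore NaN
    if not tokens:
        return 0.0
    catalog = set(catalog_tokens) - {PAD}
    hits = sum(1 for t in tokens if t in catalog)      # steps 2 and 4: A
    return hits / len(tokens)                          # steps 5 and 6: A / B


def best_match(seller_tokens, catalog_rows):
    """Return (index, score) of the best catalog row; the first row wins ties."""
    best_index, best_score = None, -1.0
    for i, row in enumerate(catalog_rows):             # step 1: loop over the catalog
        score = match_fraction(seller_tokens, row)
        if score > best_score:   # strict '>' keeps the earliest row on a tie (step 7)
            best_index, best_score = i, score
    return best_index, best_score


seller = [["Black", "Pen", "NaN"], ["Yellow", "Pen", "New"]]
catalog = [["Black", "Book", "Big"],
           ["Yellow", "Pen", "Small"],
           ["White", "notebook", "Medium"]]

for product in seller:
    index, score = best_match(product, catalog)
    print(product, "->", catalog[index], "->", f"{score:.0%} match")
```

Because the comparison is exact, 'SEACHEM' and 'Seachem' would not count as a match here; lowercasing both sides before comparing is one option if that is not wanted.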

Example:
Suppose I have 2 products from the seller, array1_example, sliced from Array1, and my product catalog, array2_example, sliced from Array2:

array1_example([['Black', 'Pen','NaN'],
                ['Yellow', 'Pen','New']], dtype=object)

array2_example([['Black', 'Book', 'Big'],
               ['Yellow', 'Pen', 'Small'],
               ['White', 'notebook', 'Medium']],dtype=object)

Output_looping_1  [1, 0], # there is 'Black' and no 'Pen' in ['Black', 'Book', 'Big'] # 1/2=50%
                  [0, 1], # there is no 'Black' and there is 'Pen' in ['Yellow', 'Pen', 'Small'] # 1/2=50%
                  [0, 0]  # there is no 'Black' and no 'Pen' in ['White', 'notebook', 'Medium'] # 0/2=0%

Output_looping_2  [0, 0, 0], # check ['Yellow', 'Pen','New'] in ['Black', 'Book', 'Big'] # 0/3=0%
                  [1, 1, 0], # check ['Yellow', 'Pen','New'] in ['Yellow', 'Pen', 'Small'] # 2/3=67%
                  [0, 0, 0]  # check ['Yellow', 'Pen','New'] in ['White', 'notebook', 'Medium'] # 0/3=0%

Output_final    ['Black', 'Pen','NaN'] ->  ['Black', 'Book', 'Big'] -> 50% match
                ['Yellow', 'Pen','New']->  ['Yellow','Pen','Small'] -> 67% match
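
The intermediate 0/1 tables from this example can be reproduced with a short sketch. It assumes 'NaN' is a padding string that is dropped before comparison; `presence_flags` is a hypothetical helper name.

```python
def presence_flags(seller_tokens, catalog_rows, pad="NaN"):
    """One row of 0/1 flags per catalog row: is each seller token present?"""
    tokens = [t for t in seller_tokens if t != pad]  # drop padding first
    return [[1 if t in row else 0 for t in tokens] for row in catalog_rows]


catalog = [["Black", "Book", "Big"],
           ["Yellow", "Pen", "Small"],
           ["White", "notebook", "Medium"]]

looping_1 = presence_flags(["Black", "Pen", "NaN"], catalog)
looping_2 = presence_flags(["Yellow", "Pen", "New"], catalog)

# Match percentage per catalog row = number of ones / number of seller tokens.
percent_1 = [sum(flags) / len(flags) for flags in looping_1]
percent_2 = [sum(flags) / len(flags) for flags in looping_2]
print(looping_1, percent_1)
print(looping_2, percent_2)
```

Taking the first maximum of each percentage list (for example with `percent_1.index(max(percent_1))`) gives the catalog row reported in Output_final.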


Array1:

array([['Mai', 'Dubai', '200ml', ..., 'NaN', 'NaN', 'NaN'],
       ['Mai', 'Dubai', 'Cup', ..., 'NaN', 'NaN', 'NaN'],
       ['Mai', 'Dubai', '1.5', ..., 'NaN', 'NaN', 'NaN'],
       ...,
       ['Seachem', 'Aquavitro', 'Alpha', ..., 'NaN', 'NaN', 'NaN'],
       ['SEACHEM', 'AQUAVITRO', 'VIBRANCE', ..., 'NaN', 'NaN', 'NaN'],
       ['SEACHEM', 'AQUAVITRO', 'CALCIFICATI', ..., 'NaN', 'NaN', 'NaN']],
      dtype=object)

Array2:

array([['2-Piece', 'Glitzi', 'Power', ..., 'NaN', 'NaN', 'NaN'],
       ['15-Piece', 'Bones', 'For', ..., 'NaN', 'NaN', 'NaN'],
       ['Sliced', 'Beets', '425', ..., 'NaN', 'NaN', 'NaN'],
       ...,
       ['Cookies', 'With', 'Hazelnut', ..., 'NaN', 'NaN', 'NaN'],
       ['Apple', 'Kale', 'Avocado', ..., 'NaN', 'NaN', 'NaN'],
       ['Selection', 'Of', 'Six', ..., 'NaN', 'NaN', 'NaN']], dtype=object)

Please give me some guidance on how to solve this. Thanks!

0 answers:

No answers