Check whether a URL contains a misspelled brand name

Asked: 2016-10-15 14:14:03

Tags: python optimization

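I have a URL that may or may not contain a misspelled brand name. Assume the misspelling is one of the many permutations of the brand's letters. I want to check whether such a misspelling is present. I wrote the code below, but its complexity is very high...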

import re
from itertools import permutations

url = "http://www.amazno.com/"
brands = [...]
# ^ this is a set of 25,000 brand names in lowercase, retrieved from Alexa.
# It contains "google" and "amazon", for example.

def has_scrambled_brand(url, brands):
    # returns 1 if any permutation of any brand occurs in the URL, else 0
    for brand in brands:
        # build every distinct permutation of this brand;
        # the set comprehension drops duplicates
        perms = {"".join(p) for p in permutations(brand)}

        for perm in perms:
            # search the URL for the permutation; re.escape makes characters
            # such as "." or "+" in a brand name match literally
            if re.search(re.escape(perm), url):
                return 1
    return 0

Is there a faster way?
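
One direction that might help, sketched under assumptions rather than offered as a definitive answer: two strings are permutations of each other exactly when their sorted letters are equal, so instead of generating every permutation of every brand, each brand can be reduced once to a sorted-letter signature. The URL is then scanned with a sliding window for each distinct brand length, and each window's signature is looked up in a set. The function names `brand_signatures` and `contains_scrambled_brand` and the two-element `sample_brands` set are hypothetical, introduced only for illustration. Like the permutation approach, this also matches the correctly spelled brand.

```python
def brand_signatures(brands):
    """Group sorted-letter signatures of the brands by brand length."""
    by_length = {}
    for brand in brands:
        signature = "".join(sorted(brand))
        by_length.setdefault(len(brand), set()).add(signature)
    return by_length


def contains_scrambled_brand(url, signatures):
    """Return True if some substring of url is a rearrangement of a brand."""
    url = url.lower()
    for length, sigs in signatures.items():
        # slide a window of this brand length across the URL
        for start in range(len(url) - length + 1):
            window = url[start:start + length]
            if "".join(sorted(window)) in sigs:
                return True
    return False


# Hypothetical stand-in for the 25,000-name set from the question.
sample_brands = {"google", "amazon"}
signatures = brand_signatures(sample_brands)

print(contains_scrambled_brand("http://www.amazno.com/", signatures))
print(contains_scrambled_brand("http://www.example.org/", signatures))
```

The signatures are built once and can be reused for every URL. The work per URL then grows with the URL length times the number of distinct brand lengths, rather than with the factorial of each brand's length.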

0 answers:

No answers yet