Is there a faster way to iterate through this code?

Time: 2019-07-09 16:40:13

Tags: python

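I have a long list of generated names and a 5000-word file of acceptable names. I want to find the names in my list that also appear in the file. How can I do that?

I tried using a loop, but it took longer than I wanted, because my names file is too long to search in full for every generated name. When n is 12 digits long, my generated list contains 531441 names.

Here is some of the code: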

from time import process_time
from itertools import product
start = process_time()
n = "5747867437"
phone = {2: ["A", "B", "C"], 3: ["D", "E", "F"], 4: ["G", "H", "I"], 5: ["J", "K", "L"], 6: ["M", "N", "O"], 7: ["P", "R", "S"], 8: ["T", "U", "V"], 9: ["W", "X", "Y"]}
li = set(open("dict.txt", "r").read().strip().split("\n"))
num = []
names = []
for x in n:
    num.append(phone[int(x)])
for y in product(*num):
    names.append(''.join(y))
# collect the generated names that are also in the set of accepted names
acceptable = []
for z in names:
    if z in li:
        acceptable.append(z)
acceptable.sort()
print(acceptable)
if acceptable:
    for a in acceptable:
        print(a + "\n")
else:
    print("NONE\n")
print(process_time() - start)

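The file "acceptable_names.txt" is the one containing the acceptable names (the code above opens it as "dict.txt"). Right now this takes 3 seconds. Is there a way to make it faster?

Thanks!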

3 answers:

Answer 0 (score: 1)

Use a set: https://docs.python.org/3/library/stdtypes.html#set-types-set-frozenset

A lookup in a list is O(n); a lookup in a set is O(1) on average.

# convert the list of generated names to a set
names = set(names)
acceptable = []
for z in li:
    if z in names:
        acceptable.append(z)
acceptable.sort()
print(acceptable)
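
To see the O(n) versus O(1) difference in practice, here is a minimal timing sketch. The data is synthetic: the 5000 made-up words, the 2000 candidates, and the names `words_list`, `words_set` and `count_hits` are assumptions for illustration only, and the actual timings will vary by machine.

```python
from timeit import timeit

# Synthetic stand-in for the 5000-word file (assumption: not the real data).
words_list = [f"NAME{i}" for i in range(5000)]
words_set = set(words_list)

# 2000 candidates; half of them exist in the word list, half do not.
candidates = [f"NAME{i}" for i in range(0, 10000, 5)]

def count_hits(container):
    """Count how many candidates are found in the container."""
    return sum(1 for c in candidates if c in container)

# Same work, different container: the list scans element by element,
# the set hashes each candidate and checks one bucket.
list_time = timeit(lambda: count_hits(words_list), number=1)
set_time = timeit(lambda: count_hits(words_set), number=1)

print(f"list: {list_time:.4f}s  set: {set_time:.4f}s")
```

Both calls find the same matches; only the time taken should differ.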

Answer 1 (score: 1)

As mentioned above, use the intersection of the two sets. Like this:

set_names = set(names)
set_li = set(li)
acceptable = set_names.intersection(set_li)

# sorted() returns a new sorted list; list.sort() sorts in place and returns None
print(sorted(acceptable))
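
The intersection idea can also be sketched end to end without building the full list of generated names first. This is only an illustration: the function name `matching_names` and the string-keyed `PHONE` table are hypothetical, and the sample names are the four used elsewhere on this page, not the real file.

```python
from itertools import product

# Keypad mapping copied from the question (7 and 9 have three letters there).
PHONE = {"2": "ABC", "3": "DEF", "4": "GHI", "5": "JKL",
         "6": "MNO", "7": "PRS", "8": "TUV", "9": "WXY"}

def matching_names(digits, accepted):
    """Return the sorted names spelled by `digits` that are also in `accepted`."""
    letter_groups = [PHONE[d] for d in digits]
    # map() yields candidates lazily; intersection() consumes them one by one,
    # so the 3**len(digits) candidates are never stored all at once.
    candidates = map("".join, product(*letter_groups))
    return sorted(set(accepted).intersection(candidates))

print(matching_names("566", {"JIM", "JON", "TIM", "IKE"}))  # ['JON']
```

`set.intersection` accepts any iterable, which is what lets the generator be passed in directly.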

Answer 2 (score: 1)

As suggested, use sets, and trim whatever the code does not need. Here is your code as an MRE (minimal reproducible example):

from itertools import product

def writeAcceptFile(filename):
    with open(filename,"w") as f:
        f.write("JIM\nJON\nTIM\nIKE")

def getNamesFromFile(filename):
    with open(filename) as f:
        return set(name.strip() for name in f.readlines())


fn = "acceptable_names.txt" 
writeAcceptFile(fn)
accept = getNamesFromFile(fn)

phone = {2: ["A", "B", "C"], 3: ["D", "E", "F"], 4: ["G", "H", "I"],
         5: ["J", "K", "L"], 6: ["M", "N", "O"], 7: ["P", "R", "S"], 
         8: ["T", "U", "V"], 9: ["W", "X", "Y"]}

n = 566

ok = [k for k in ( ''.join(l) 
                  for l in product(*(phone[int(x)] 
                                     for x in str(n)))) 
      if k in accept]

print(ok) # ['JON']

Instead of the "silly" one-liner, you can use lists and loops:

# or by foot:
names = []
num = []
for x in str(n):
    num.append(phone[int(x)])
for y in product(*num):
    name = ''.join(y)   # separate variable, so the number n is not overwritten
    # only add name if in accepted list
    if name in accept:
        names.append(name)

print(names)  # ['JON']

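The reason to use sets is that they are extremely fast for containment checks (i.e. constant time on average, no matter how many items they hold).

If the allowed words are kept in a list, your code scans that whole list (5k entries) for each of the words you generated (531441), which is what makes it slow.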
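
A different technique from the one in the answers, offered only as a sketch: rather than generating every letter combination and looking it up, walk the accepted names once and check whether each one spells the number. With about 5000 names this is roughly 5000 short comparisons instead of 531441 lookups. The names `LETTER_TO_DIGIT` and `names_for_number` are hypothetical, and the sample names are placeholders for the real file.

```python
# Keypad mapping copied from the question (no Q or Z on this keypad).
PHONE = {"2": "ABC", "3": "DEF", "4": "GHI", "5": "JKL",
         "6": "MNO", "7": "PRS", "8": "TUV", "9": "WXY"}

# Invert the keypad: each letter points back to its digit.
LETTER_TO_DIGIT = {letter: digit
                   for digit, letters in PHONE.items()
                   for letter in letters}

def names_for_number(digits, accepted):
    """Return the sorted accepted names whose letters map onto `digits`."""
    matches = []
    for name in accepted:
        if len(name) != len(digits):
            continue
        # .get() returns None for letters missing from the keypad,
        # so such names never match.
        if all(LETTER_TO_DIGIT.get(ch) == d for ch, d in zip(name, digits)):
            matches.append(name)
    return sorted(matches)

print(names_for_number("566", ["JIM", "JON", "TIM", "IKE"]))  # ['JON']
```

The cost here grows with the size of the names file rather than with 3 to the power of the number of digits, so it should matter most for long numbers.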