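My programming task is: given an input file words.txt, create a graph in which the nodes are the words themselves and the edges connect words that differ by exactly one letter. (For example, "abcdef" and "abcdeg" would have an edge.)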
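For later stages of the problem I do need to use an adjacency list implementation (so no adjacency matrices). If possible, I would prefer not to use external libraries.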
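My code is fully functional, but it takes about 7 seconds to run (for roughly 5,000 words). I need help optimizing it further. The main optimization I made was to build, for each word, a set of "char + index" tokens and compare those sets, which is faster than sum(x != y for x, y in zip(word1, word2)).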
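To make the two comparison strategies concrete, here is a minimal sketch of both. The helper names (`differs_by_one_zip`, `smartset`, `differs_by_one_sets`) are hypothetical and only for illustration; the sketch assumes both words have the same length and contain letters only.

```python
def differs_by_one_zip(word1, word2):
    """Baseline: count mismatched positions directly."""
    return sum(x != y for x, y in zip(word1, word2)) == 1


def smartset(word):
    """Build the set of char+index tokens, e.g. 'ab' -> {'a0', 'b1'}."""
    return {"{}{}".format(letter, i) for i, letter in enumerate(word)}


def differs_by_one_sets(set1, set2, length):
    """Equal-length words differ in one position when they share length - 1 tokens."""
    return len(set1 & set2) == length - 1


if __name__ == "__main__":
    a, b = "abcdef", "abcdeg"
    print(differs_by_one_zip(a, b))
    print(differs_by_one_sets(smartset(a), smartset(b), len(a)))
```

The set version pays the cost of building tokens once per word, so each later comparison is a single set intersection rather than a Python-level loop over characters.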
Here is my code so far:
def createGraph(filename):
    """Time taken to generate graph: {} seconds."""
    wordReference = {}  # Words and their smartsets (for comparisons)
    wordEdges = {}      # Words and their edges (graph representation)
    edgeCount = 0       # Count of edges
    with open(filename) as f:
        for raw in f.readlines():  # For each line in the file
            word = raw.strip()     # Strip the trailing \n
            # Create the smartset for this word in wordReference
            wordReference[word] = set("{}{}".format(letter, str(i)) for i, letter in enumerate(word))
            wordEdges[word] = set()  # Create a set to contain the edges
            for parsed in wordEdges:  # For each of the words we've already parsed
                # Compare smartsets to see if there should be an edge (assumes 6-letter words)
                if len(wordReference[word] & wordReference[parsed]) == 5:
                    wordEdges[parsed].add(word)  # Add edge parsed -> word
                    wordEdges[word].add(parsed)  # Add edge word -> parsed
                    edgeCount += 1  # Add 1 to edgeCount
    return wordEdges, edgeCount, len(wordReference)  # Return the graph, the edge count, and the word count
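One direction that might reduce the pairwise comparisons, sketched here under assumptions and not benchmarked against the code above: group words by "one position removed" patterns, so that only words sharing a pattern are ever paired. The function name `build_graph_buckets` and its list-of-words input are hypothetical; it still produces an adjacency list (a dict of sets) and uses only the standard library.

```python
from collections import defaultdict


def build_graph_buckets(words):
    """Return (adjacency dict of sets, edge count) for an iterable of words."""
    edges = {word: set() for word in words}
    buckets = defaultdict(list)
    for word in edges:  # iterating the dict skips duplicate words
        for i in range(len(word)):
            # Words that differ only at position i share this key
            buckets[(i, word[:i], word[i + 1:])].append(word)

    edge_count = 0
    for group in buckets.values():
        # Every pair inside one bucket differs in exactly one position
        for a_index, a in enumerate(group):
            for b in group[a_index + 1:]:
                edges[a].add(b)
                edges[b].add(a)
                edge_count += 1
    return edges, edge_count


if __name__ == "__main__":
    graph, count = build_graph_buckets(["abcdef", "abcdeg", "abcdeh", "zzzzzz"])
    print(count)
    print(sorted(graph["abcdef"]))
```

Two distinct words that differ at exactly one position share exactly one bucket, so each edge is counted once. The work is roughly proportional to the number of words times the word length, plus the number of edges, instead of comparing every pair of words.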