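I'm trying to solve this problem on hackerrank.com:
https://www.hackerrank.com/challenges/find-strings
My code handles small cases fine, but on large cases my dictionary quickly runs out of memory. What can I do to fix this? I don't want to use a list, because checking whether an entry already exists takes too long. Here is my code: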
n = int(raw_input())
words = []
for x in range(n):
    words.append(raw_input())
test = int(raw_input())
queries = []
for x in range(test):
    queries.append(raw_input())
dict_of_subwords = {}
for x in words:
    len_of_x = len(x)
    for i in range(len_of_x):
        for j in range(i, len_of_x):
            dict_of_subwords[x[i:j+1]] = 1
list_of_subwords = dict_of_subwords.keys()
list_of_subwords.sort()
for x in queries:
    try:
        print list_of_subwords[int(x)-1]
    except:
        print "INVALID"
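Answer 0 (score: 0)
Since there have been several suggestions about making a more memory-efficient version, here is one that tries to minimize the amount of storage (while still using the same algorithmic approach):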
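# Collect every distinct substring in a set (duplicates are dropped on insert).
subwords = set()
num_words = int(raw_input())
for _ in xrange(num_words):
    word = raw_input()
    for i in xrange(len(word)):
        for j in xrange(i, len(word)):
            subwords.add(word[i:j+1])
# Sort once so that the k-th substring is a plain index lookup.
subwords = sorted(subwords)
num_queries = int(raw_input())
for _ in xrange(num_queries):
    query = raw_input()
    try:
        k = int(query)
        if k < 1:
            # Without this check, k = 0 would index from the end of the list.
            raise IndexError(k)
        print subwords[k - 1]
    except (ValueError, IndexError):
        print "INVALID"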
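To get a feel for why storing every substring is expensive even with a set, here is a rough back-of-the-envelope sketch. It assumes the worst case where all substrings of a word are distinct; the function name `substring_stats` and the word length 2000 are illustrative assumptions, not values taken from the problem statement.

```python
def substring_stats(n):
    """Upper bounds for one word of length n, assuming all substrings are distinct.

    Returns (number of substrings, total characters across all of them).
    """
    # There are n - l + 1 substrings of each length l, for l = 1..n.
    count = n * (n + 1) // 2
    # Sum of l * (n - l + 1) for l = 1..n has the closed form n(n+1)(n+2)/6.
    total_chars = n * (n + 1) * (n + 2) // 6
    return count, total_chars


# Hypothetical word length, chosen only for illustration.
count, total_chars = substring_stats(2000)
```

For a length of 2000 this gives about 2 million substrings and about 1.3 billion characters in total, before any per-object overhead, so the storage grows much faster than the input itself.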
Answer 1 (score: 0)
You have to use a suffix array (see the wiki).
Suffix array implementation in Python:
Suffix arrays are closely related to suffix trees:
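As a minimal sketch of how a suffix array could be applied here (this is not the implementation linked above, and the names `build_index` and `kth_substring` are made up for illustration): sort all suffixes of all words, and for each suffix record the length of its common prefix with the previous one. Each suffix then contributes only the prefixes longer than that common prefix, so the k-th distinct substring can be located without ever storing the substrings themselves. The construction below sorts suffixes naively by slicing, which is simple but slow and still builds temporary sort keys; a real solution would likely need a faster construction algorithm.

```python
from bisect import bisect_left


def common_prefix_len(a, b):
    """Length of the longest common prefix of strings a and b."""
    n = min(len(a), len(b))
    i = 0
    while i < n and a[i] == b[i]:
        i += 1
    return i


def build_index(words):
    """Naive generalized suffix array over all words.

    Returns (entries, cumulative), where entries[i] = (word, start, lcp)
    describes the i-th suffix in sorted order and cumulative[i] is the number
    of distinct substrings contributed by suffixes 0..i.
    """
    suffixes = sorted(
        ((w, i) for w in words for i in range(len(w))),
        key=lambda p: p[0][p[1]:],  # naive: compares the suffix text itself
    )
    entries, cumulative = [], []
    total, prev = 0, ""
    for word, start in suffixes:
        cur = word[start:]
        lcp = common_prefix_len(prev, cur)
        # Prefixes of length <= lcp were already counted for earlier suffixes.
        total += len(cur) - lcp
        entries.append((word, start, lcp))
        cumulative.append(total)
        prev = cur
    return entries, cumulative


def kth_substring(entries, cumulative, k):
    """Return the k-th (1-based) distinct substring in sorted order, or None."""
    if k < 1 or not cumulative or k > cumulative[-1]:
        return None
    idx = bisect_left(cumulative, k)  # first suffix whose running count reaches k
    word, start, lcp = entries[idx]
    before = cumulative[idx - 1] if idx > 0 else 0
    length = lcp + (k - before)
    return word[start:start + length]
```

With the words `aab` and `aac`, `kth_substring(entries, cumulative, 3)` returns `aab`, and a query past the total of 8 distinct substrings returns `None`, which the caller can print as `INVALID`.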