我正在尝试解决here.所描述的键盘自动完成问题 问题是计算一个单词所需的击键次数,给出一些字典和自动完成规则。例如,对于字典:
data = ['hello', 'hell', 'heaven', 'goodbye']
我们得到以下结果(请参阅上面的链接以获得进一步的解释):
{'hell': 2, 'heaven': 2, 'hello': 3, 'goodbye': 1}
快速说明:如果用户输入h
,则e
会自动填充,因为以h
开头的所有字词都有e
作为第二个字母。现在,如果用户输入l
,则填充另一个l
,为单词hell
提供2个笔画。当然,hello
需要再多一次中风。请参阅上面的链接以获取更多示例。
我的Trie
代码如下,它可以正常工作(取自https://en.wikipedia.org/wiki/Trie)。 Stack
代码是从根解析树(参见下面的编辑):
class Stack(object):
def __init__(self, size):
self.data = [None]*size
self.i = 0
self.size = size
def pop(self):
if self.i == 0:
return None
item = self.data[self.i - 1]
self.i-= 1
return item
def push(self, item):
if self.i >= self.size:
return None
self.data[self.i] = item
self.i+= 1
return item
def __str__(self):
s = '# Stack contents #\n'
if self.i == 0:
return
for idx in range(self.i - 1, -1, -1):
s+= str(self.data[idx]) + '\n'
return s
class Trie(object):
def __init__(self, value, children):
self.value = value #char
self.children = children #{key, trie}
class PrefixTree(object):
def __init__(self, data):
self.root = Trie(None, {})
self.data = data
for w in data:
self.insert(w, w)
def insert(self, string, value):
node = self.root
i = 0
n = len(string)
while i < n:
if string[i] in node.children:
node = node.children[string[i]]
i = i + 1
else:
break
while i < n:
node.children[string[i]] = Trie(string[:i], {})
node = node.children[string[i]]
i = i + 1
node.value = value
def find(self, key):
node = self.root
for char in key:
if char in node.children:
node = node.children[char]
else:
return None
return node
我无法弄明白如何计算笔画数:
data = ['hello', 'hell', 'heaven', 'goodbye']
tree = PrefixTree(data)
strokes = {w:1 for w in tree.data} #at least 1 stroke is necessary
这是从根解析树的代码:
stack = Stack(100)
stack.push((None, pf.root))
print 'Key\tChilds\tValue'
print '--'*25
strokes = {}
while stack.i > 0:
key, curr = stack.pop()
# if something:
#update strokes
print '%s\t%s\t%s' % (key, len(curr.children), curr.value)
for key, node in curr.children.items():
stack.push((key, node))
print strokes
任何想法或建设性意见都会有所帮助,谢谢!
@SergiyKolesnikov的精彩回答。可以进行一项小的更改,以避免调用endsWith()
。我刚刚给Trie类添加了一个布尔字段:
class Trie(object):
def __init__(self, value, children, eow):
self.value = value #char
self.children = children #{key, trie}
self.eow = eow # end of word
在insert()的末尾:
def insert(self, string, value):
#...
node.value = value
node.eow = True
然后将curr.value.endswith('$'):
替换为curr.eow
。谢谢大家!
答案 0 :(得分:2)
你的例子的trie可能看起来像这样
' '
| \
H G
| |
E O
| \ |
L A O
| | |
L$ V D
| | |
O E B
| |
N Y
|
E
可以将trie中的哪些节点视为用户击键的标记?这种节点有两种类型:
$
)的节点,因为如果当前单词不是需要的话,用户必须键入下一个字母。在递归遍历trie时,计算在到达单词的最后一个字母之前遇到了多少这些标记节点。此计数是单词所需的笔画数。
对于&#34; hell&#34;它是两个标记节点:' '
和E
(2笔画)
对于&#34;你好&#34;它是三个标记节点:' '
,E
,L$
(3笔)
等等...
您的实施需要更改的内容:
需要在树中标记有效单词的结尾,以便可以检查第二个条件。因此,我们从
更改PrefixTree.insert()
方法的最后一行
node.value = value
到
node.value = value + '$'
现在我们为每个堆栈项添加一个笔画计数器(三次推送到堆栈中的最后一个值)以及增加计数器的检查:
stack = Stack(100)
stack.push((None, tree.root, 0)) # We start with stroke counter = 0
print('Key\tChilds\tValue')
print('--'*25)
strokes = {}
while stack.i > 0:
key, curr, stroke_counter = stack.pop()
if curr.value is not None and curr.value.endswith('$'):
# The end of a valid word is reached. Save the word and the corresponding stroke counter.
strokes[curr.value[:-1]] = stroke_counter
if len(curr.children) > 1:
# Condition 2 is true. Increase the stroke counter.
stroke_counter += 1
if curr.value is not None and curr.value.endswith('$') and len(curr.children) > 0:
# Condition 1 is true. Increase the stroke counter.
stroke_counter += 1
print('%s\t%s\t%s' % (key, len(curr.children), curr.value))
for key, node in curr.children.items():
stack.push((key, node, stroke_counter)) # Save the stroke counter
print(strokes)
<强>输出:强>
Key Childs Value
--------------------------------------------------
None 2 None
h 1
e 2 h
a 1 he
v 1 hea
e 1 heav
n 0 heaven$
l 1 he
l 1 hell$
o 0 hello$
g 1
o 1 g
o 1 go
d 1 goo
b 1 good
y 1 goodb
e 0 goodbye$
{'heaven': 2, 'goodbye': 1, 'hell': 2, 'hello': 3}
答案 1 :(得分:1)
当你浏览堆栈时,你应该为每个节点保留一个笔画计数器:
出于文档目的,这是我的Ruby回答:
class Node
attr_reader :key, :children
attr_writer :final
def initialize(key, children = [])
@key = key
@children = children
@final = false
end
def final?
@final
end
end
class Trie
attr_reader :root
def initialize
@root = Node.new('')
end
def add(word)
node = root
word.each_char.each{|c|
next_node = node.children.find{|child| child.key == c}
if next_node then
node = next_node
else
next_node = Node.new(c)
node.children.push(next_node)
node = next_node
end
}
node.final = true
end
def count_strokes(node=root,word="",i=0)
word=word+node.key
strokes = {}
if node.final? then
strokes[word]=i
if node.children.size>0 then
i+=1
end
elsif node.children.size>1 then
i+=1
end
node.children.each{|c|
strokes.merge!(count_strokes(c, word, i))
}
strokes
end
end
data = ['hello', 'hell', 'heaven', 'goodbye']
trie = Trie.new
data.each do |word|
trie.add(word)
end
# File.readlines('/usr/share/dict/british-english').each{|line|
# trie.add line.strip
# }
puts trie.count_strokes
#=> {"hell"=>2, "hello"=>3, "heaven"=>2, "goodbye"=>1}
仅60行,10万字不到3秒。