A specific dynamic nested dictionary, autovivification implementation

Time: 2016-12-07 00:58:08

Tags: python dictionary autovivification

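I'm trying to implement a nested dictionary structure in a specific way. I'm reading in a long list of words. These words will eventually need to be searched often and efficiently, so this is how I want to set up my dictionary:

I'm trying to create a nested dictionary structure where the first key is the length of the word and its value is a dict keyed by the first letter of the word; each of those values is a dict keyed by the second letter, whose values are dicts keyed by the third letter, and so on. The last letter maps to the word itself.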

So if I read in "car", "can", and "joe",

I get

{3: {'c': {'a': {'r': 'car', 'n': 'can'}}, 'j': {'o': {'e': 'joe'}}}}

I need to do this for about 100,000 words, ranging in length from 2 to 27 letters.

I have looked through "What is the best way to implement nested dictionaries?" and "Dynamic nested dictionaries",

but haven't had any luck figuring this out.

I can certainly get my words out of my text file using

for word in text_file.read().split()

and I can break each word into its characters using

for char in word

I just can't figure out how to build this structure. Any help would be greatly appreciated.

3 answers:

Answer 0 (score: 3)

Here is a short example of how to build a trie with autovivification on top of defaultdict. For every node that terminates a word, it stores an extra key 'term' to indicate it.

from collections import defaultdict

trie = lambda: defaultdict(trie)

def add_word(root, s):
    node = root
    for c in s:
        node = node[c]
    node['term'] = True

def list_words(root, length, prefix=''):
    if not length:
        if 'term' in root:
            yield prefix
        return

    for k, v in root.items(): 
        if k != 'term':
            yield from list_words(v, length - 1, prefix + k)

WORDS = ['cars', 'car', 'can', 'joe']
root = trie()
for word in WORDS:
    add_word(root, word)

print('Length {}'.format(3))
print('\n'.join(list_words(root, 3)))
print('Length {}'.format(4))
print('\n'.join(list_words(root, 4)))

Output (on Python 3.7+, where dicts preserve insertion order):

Length 3
car
can
joe
Length 4
cars
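
Since the question is about searching, a membership test may be useful next to the listing function. The sketch below is an addition rather than part of the original answer: the helper name contains is hypothetical, and it walks the trie with .get() because indexing a defaultdict with node[c] would silently create missing nodes during a lookup.

```python
from collections import defaultdict

trie = lambda: defaultdict(trie)

def add_word(root, s):
    node = root
    for c in s:
        node = node[c]
    node['term'] = True

def contains(root, s):
    """Return True if s was added as a complete word (hypothetical helper)."""
    node = root
    for c in s:
        # .get() does not trigger defaultdict's autovivification
        node = node.get(c)
        if node is None:
            return False
    return 'term' in node

root = trie()
for word in ['cars', 'car', 'can', 'joe']:
    add_word(root, word)

print(contains(root, 'car'))   # True
print(contains(root, 'ca'))    # False: a prefix, not a stored word
print(contains(root, 'cat'))   # False, and no 't' node is created
```

Each lookup touches one dict per letter, so its cost depends on the word length rather than on the number of stored words.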

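Answer 1 (score: 2)

I'm not sure what your purpose for this structure is, but here is a solution that uses recursion to generate the structure you describe: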

from collections import defaultdict
d = defaultdict(list)
words = ['hello', 'world', 'hi']


def nest(d, word):
    if word == "":
        return d
    d = {word[-1:]: word if d is None else d}
    return nest(d, word[:-1])


for word in words:
    l = len(word)
    d[l].append(nest(None, word))

print(d)
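
Note that the code above appends one independent nested dict per word, so "car" and "can" would not share their 'c' and 'a' levels. If shared prefixes are wanted, as in the question's example, one possible variant walks down with dict.setdefault instead. This is only a sketch, and insert_word is a hypothetical name:

```python
def insert_word(index, word):
    """Insert word into index[len(word)], sharing common prefixes."""
    if not word:
        return  # nothing to store for an empty string
    node = index.setdefault(len(word), {})
    for ch in word[:-1]:
        # reuse the existing branch for this letter, or start a new one
        node = node.setdefault(ch, {})
    # the last letter maps to the whole word, as in the question
    node[word[-1]] = word

index = {}
for word in ['car', 'can', 'joe']:
    insert_word(index, word)

print(index)
# {3: {'c': {'a': {'r': 'car', 'n': 'can'}}, 'j': {'o': {'e': 'joe'}}}}
```

Because every word under a given length key has the same number of letters, a leaf string never collides with an inner dict.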

Answer 2 (score: 1)

Here's a way to do it without using collections.defaultdict or creating your own custom subclass of dict, so the resulting dictionary is just an ordinary dict object:

import pprint

def _build_dict(wholeword, chars, val, dic):
    if len(chars) == 1:
        dic[chars[0]] = wholeword
        return
    new_dict = dic.get(chars[0], {})
    dic[chars[0]] = new_dict
    _build_dict(wholeword, chars[1:], val, new_dict)

def build_dict(words):
    dic = {}
    for word in words:
        root = dic.setdefault(len(word), {})
        _build_dict(word, list(word), word[1:], root)
    return dic

words = ['a', 'ox', 'car', 'can', 'joe']
data_dict = build_dict(words)
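pprint.pprint(data_dict)

Output:

{1: {'a': 'a'},
 2: {'o': {'x': 'ox'}},
 3: {'c': {'a': {'n': 'can', 'r': 'car'}}, 'j': {'o': {'e': 'joe'}}}}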

It is based on the recursive algorithm shown in a message titled "Building and Transvering multi-level dictionaries" in the python.org Python-list archives.
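
To use the resulting structure for searching, a lookup only has to follow one key per letter. Here is a minimal sketch, assuming the layout produced by build_dict above (length first, then letters, with the whole word as the leaf). The name has_word is hypothetical, and the small data_dict literal is written out by hand so the snippet runs on its own:

```python
def has_word(data, word):
    """Return True if word is stored in a length-keyed nested dict."""
    node = data.get(len(word))
    for ch in word:
        if not isinstance(node, dict):
            return False  # missing length bucket, or a dead end
        node = node.get(ch)
    # after the last letter we should be sitting on the stored word itself
    return node == word

data_dict = {
    2: {'o': {'x': 'ox'}},
    3: {'c': {'a': {'n': 'can', 'r': 'car'}}, 'j': {'o': {'e': 'joe'}}},
}

print(has_word(data_dict, 'car'))  # True
print(has_word(data_dict, 'cat'))  # False
print(has_word(data_dict, 'ca'))   # False: 'ca' is looked up under length 2
```

Using .get() throughout means a failed lookup never modifies the dictionary.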