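I have a tree structure in which every node holds a list of documents (document length can vary from 5 to 500), and each document contains many words. For each node, I want to store, as a dictionary, the number of documents in which each word occurs.
For example, if the documents at some node A are
A = [['b','m','n'],['b'],['g'],['o','b','g'],['b','g']]
then the counts should be stored on the node as A.occurlist = {'b':4,'m':1,'n':1,'g':3,'o':1}
I am running the following code, but it fails when it recurses and raises: TypeError: 'dict' object is not callable
Code: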
import copy
import itertools

def occurlist(node):
    wordSet = set()
    lis = []
    next_node = []
    child_nodes = node.children
    for child in child_nodes:
        next_node.append(child)
        lis += child.documents
    wordSet = set(itertools.chain.from_iterable(lis))
    occurDict = {word:0 for word in wordSet}
    for child in child_nodes:
        occurlist = {}
        occurlist.update(copy.deepcopy(occurDict))
        for doc in child.documents:
            for word in wordSet:
                if word in doc:
                    occurlist[word] +=1
        child.occurlist = occurlist
        print child.name
    print len(next_node)
    if next_node:
        for nn in next_node:
            if not nn.update:
                occurlist(nn)

occurlist(Savings_Accounts)
Savings_Accounts is the name of the root node.
Answer 0: (score: 0)
If you only want the per-element document count, use collections.Counter:
from collections import Counter
A = [['b','m','n'],['b'],['g'],['o','b','g'],['b','g']]
dict(Counter([ele for sub in A for ele in set(sub)]))
#{'b': 4, 'm': 1, 'g': 3, 'o': 1, 'n': 1}
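As for the TypeError itself: inside the function `occurlist`, the assignment `occurlist = {}` makes that name a local variable bound to a dict, so the later call `occurlist(nn)` tries to call the dict. Giving the local a different name avoids this. Below is a minimal sketch of applying the Counter approach recursively to a tree. The `Node` class and the function name `count_occurrences` are hypothetical stand-ins, not the asker's actual classes. Note also that this sketch counts only the words in a node's own documents, whereas the question's code seeds each child with zero counts for every word seen among its siblings and skips nodes whose `update` attribute is truthy.

```python
from collections import Counter


class Node:
    """Hypothetical stand-in for the asker's tree node."""

    def __init__(self, name, documents=None, children=None):
        self.name = name
        self.documents = documents or []
        self.children = children or []
        self.occurlist = {}


def count_occurrences(node):
    """Store per-word document counts on node and on every descendant."""
    # set(doc) makes a word count once per document, even if it repeats.
    counts = Counter(word for doc in node.documents for word in set(doc))
    # The local is named `counts`, so it cannot shadow the function name.
    node.occurlist = dict(counts)
    for child in node.children:
        count_occurrences(child)


# Example data: the leaf uses the documents from the question.
leaf = Node("A", [['b', 'm', 'n'], ['b'], ['g'], ['o', 'b', 'g'], ['b', 'g']])
root = Node("Savings_Accounts", [['b', 'b'], ['x']], [leaf])
count_occurrences(root)
print(leaf.occurlist)
```

Python's default recursion limit is far deeper than most document trees, so plain recursion is usually fine here; for a very deep tree an explicit stack would be the safer choice.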