How do I create a matrix from a defaultdict to implement raw bigram counts?

Time: 2016-02-02 12:48:09

Tags: python dictionary n-gram defaultdict

I want to compute raw bigram counts for some words. To do that, I built a defaultdict that maps each pair of words to its count, like this:

[(('went','then'),1),(('went','forward'),3),(('go','then'),2)]

So, to get the raw bigram counts, I need to build a matrix that looks like this:

       then  forward 
went     1       3
go       2       0

How can I do this? I haven't been able to find a way.
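For context, counts shaped like the ones above could come from a `defaultdict(int)` filled while walking over adjacent word pairs. This is only a sketch: the `tokens` list is invented for illustration and is not the data from the question.

```python
from collections import defaultdict

# Hypothetical token sequence, made up for this example only.
tokens = ["went", "forward", "went", "then", "go", "then"]

# Count each adjacent (first, second) pair; missing keys start at 0.
bigram_counts = defaultdict(int)
for first, second in zip(tokens, tokens[1:]):
    bigram_counts[(first, second)] += 1

# A list of ((first, second), count) tuples, the same shape as in the question.
pairs = list(bigram_counts.items())
print(pairs)
```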

2 Answers:

Answer 0 (score: 0)

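I'm not exactly sure what you're trying to do, but the code below extracts the data from the list of nested tuples and puts it into a list of lists.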

data = [
    (('went', 'then'), 1), 
    (('went', 'forward'), 3), 
    (('go', 'then'), 2),
]

# Gather row & column keys (sorted, so the order is the same on every run)
rowkeys, colkeys = [sorted(set(u)) for u in zip(*[t[0] for t in data])]

#Put count data into 2D table
datadict = dict(data)
table = [[datadict.get((r, c), 0) for c in colkeys] for r in rowkeys]

#Dump table
print(' '.join(colkeys))
for r, row in zip(rowkeys, table):
    print(r, row)

Output:

forward then
go [0, 2]
went [3, 1]
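If the goal is a labelled matrix laid out like the one in the question, a variant along these lines might help. It is a minimal sketch that assumes the counts live in a `defaultdict` keyed by `(row, column)` tuples. The helper names `build_matrix` and `format_matrix` are made up for this example, and first-seen ordering relies on dicts preserving insertion order (Python 3.7+).

```python
from collections import defaultdict

# Counts from the question, stored in a defaultdict keyed by (row, column).
counts = defaultdict(int, {
    ('went', 'then'): 1,
    ('went', 'forward'): 3,
    ('go', 'then'): 2,
})

def build_matrix(counts):
    """Return (row_labels, col_labels, matrix) with labels in first-seen order."""
    # dict.fromkeys drops duplicates while keeping insertion order.
    rows = list(dict.fromkeys(r for r, _ in counts))
    cols = list(dict.fromkeys(c for _, c in counts))
    # .get() returns 0 for unseen pairs without inserting them into the defaultdict.
    matrix = [[counts.get((r, c), 0) for c in cols] for r in rows]
    return rows, cols, matrix

def format_matrix(rows, cols, matrix):
    """Render the matrix as aligned text with row and column labels."""
    width = max(len(label) for label in rows + cols) + 2
    lines = [' ' * width + ''.join(c.rjust(width) for c in cols)]
    for label, row in zip(rows, matrix):
        lines.append(label.ljust(width) + ''.join(str(v).rjust(width) for v in row))
    return '\n'.join(lines)

rows, cols, matrix = build_matrix(counts)
print(format_matrix(rows, cols, matrix))
```

Using `counts.get(...)` rather than `counts[...]` keeps the lookup from adding zero-count entries to the defaultdict as a side effect.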

Answer 1 (score: 0)

This script solves your problem. You need to build a dictionary of dictionaries:

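data = [(('went', 'then'), 1), (('went', 'forward'), 3), (('go', 'then'), 2)]
res = {}
for elm in data:
    value = elm[1]
    key0 = elm[0][0]
    key1 = elm[0][1]
    # Reuse the inner dict for key0 if it already exists, otherwise create it.
    # Assigning a fresh dict on every pass would drop earlier entries for key0.
    res.setdefault(key0, {})[key1] = value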

print(res['go']['then'])
2
print(res['went']['forward'])
3
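Since the question starts from a defaultdict, a nested `defaultdict` is another way to get a dictionary of dictionaries, with unseen pairs reading as 0. This is a sketch rather than the only way to do it, and the name `nested` is chosen here for illustration.

```python
from collections import defaultdict

data = [(('went', 'then'), 1), (('went', 'forward'), 3), (('go', 'then'), 2)]

# The outer dict creates an inner defaultdict(int) the first time a row key is used.
nested = defaultdict(lambda: defaultdict(int))
for (first, second), count in data:
    nested[first][second] += count

print(nested['went']['then'])     # 1
print(nested['went']['forward'])  # 3
print(nested['go']['forward'])    # 0 (pair never seen, so the default applies)
```

One caveat: reading a missing pair with `nested[a][b]` inserts it with a count of 0, so use `nested[a].get(b, 0)` if the structure should stay unchanged on lookup.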