Creating an array from a dict

Time: 2016-03-22 09:49:56

Tags: python numpy list-comprehension

I have some words in a dictionary, and from these and some sentences I want to build a specific array.

words = {'a': array([ 1.78505888, -0.40040435, -0.2555062 ]), 'c': array([ 0.58101204, -0.23254054, -0.5700197 ]), 'b': array([ 1.17213122,  0.38232652, -0.78477569]), 'd': array([-0.07545012, -0.10094538, -0.98136142])}

sentences = [['a', 'c'], ['b', 'a', 'd'], ['d', 'c']]

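What I want is an array whose first row is the values of 'a' and 'c' stacked vertically.
The second row is the values of 'b', 'a' and 'd' stacked vertically, and the third is the values of 'd' and 'c' stacked vertically.

I tried: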

np.vstack([words[word] for word in sentences[0]])
>>> array([[ 1.78505888, -0.40040435, -0.2555062 ],
           [ 0.58101204, -0.23254054, -0.5700197 ]])

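So that gives me my first row, but I can't get a list comprehension to do this for all of `sentences` (it only works for one).

Edit: basically, what I am trying to do is the following: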

first_row = np.vstack([words[word] for word in sentences[0]])
second_row = np.vstack([words[word] for word in sentences[1]])
third_row = np.vstack([words[word] for word in sentences[2]])

l = []
l.append(first_row)
l.append(second_row)
l.append(third_row)

print(np.array(l, dtype=object))  # rows differ in length, so an object array is needed
>>> [[[ 1.78505888 -0.40040435 -0.2555062 ]
      [ 0.58101204 -0.23254054 -0.5700197 ]]

     [[ 1.17213122  0.38232652 -0.78477569]
      [ 1.78505888 -0.40040435 -0.2555062 ]
      [-0.07545012 -0.10094538 -0.98136142]]

     [[-0.07545012 -0.10094538 -0.98136142]
      [ 0.58101204 -0.23254054 -0.5700197 ]]]

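1 answer:

Answer 0: (score: 2)

You can use np.searchsorted to establish the correspondence between the string keys of words and the strings in each element of sentences. Repeat this for every element of sentences to get the final result. That way, only one level of looping is needed. The implementation would look like this: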

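import numpy as np

K = np.array(list(words.keys()))              # keys as an array (keys() is a view in Python 3)
sortidx = np.argsort(K)                       # order that sorts the keys
V = np.vstack(list(words.values()))[sortidx]  # value rows, reordered to match the sorted keys
out = [V[np.searchsorted(K,S,sorter=sortidx)] for S in sentences]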
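
To see why this lookup works, here is a minimal sketch on a toy key array. The names `keys`, `order`, `query`, `pos` and `recovered` are illustrative and do not come from the answer. With `sorter=order`, `np.searchsorted` returns positions within the sorted keys, so those positions can index any array that has been reordered the same way. The sketch assumes every queried word is actually present among the keys; a missing word would silently map to a neighbouring position.

```python
import numpy as np

# Toy keys in arbitrary (unsorted) order; hypothetical names for illustration.
keys = np.array(['a', 'c', 'b', 'd'])
order = np.argsort(keys)        # indices that put the keys in sorted order
sorted_keys = keys[order]       # ['a', 'b', 'c', 'd']

query = ['b', 'a', 'd']
# Position of each query word within the *sorted* key order.
pos = np.searchsorted(keys, query, sorter=order)

# Indexing the sorted keys with those positions recovers the query words.
recovered = sorted_keys[pos]
print(pos)
print(recovered)
```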

Sample run:

In [122]: words
Out[122]: 
{'a': array([ 1.78505888, -0.40040435, -0.2555062 ]),
 'b': array([ 1.17213122,  0.38232652, -0.78477569]),
 'c': array([ 0.58101204, -0.23254054, -0.5700197 ]),
 'd': array([-0.07545012, -0.10094538, -0.98136142])}

In [123]: sentences
Out[123]: [['a', 'c'], ['b', 'a', 'd'], ['d', 'c']]

In [124]: K = np.array(list(words.keys()))
     ...: sortidx = np.argsort(K)
     ...: V = np.vstack(list(words.values()))[sortidx]
     ...: out = [V[np.searchsorted(K,S,sorter=sortidx)] for S in sentences]
     ...: 

In [125]: out
Out[125]: 
[array([[ 1.78505888, -0.40040435, -0.2555062 ],
        [ 0.58101204, -0.23254054, -0.5700197 ]]),
 array([[ 1.17213122,  0.38232652, -0.78477569],
        [ 1.78505888, -0.40040435, -0.2555062 ],
        [-0.07545012, -0.10094538, -0.98136142]]),
 array([[-0.07545012, -0.10094538, -0.98136142],
        [ 0.58101204, -0.23254054, -0.5700197 ]])]
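
For comparison, a plain nested list comprehension should give the same list of per-sentence arrays. This is a sketch of an alternative, not the approach used in the answer, and the name `rows` is made up for illustration. Because the sentences differ in length, the result stays a Python list of 2-D arrays rather than one rectangular 3-D array.

```python
import numpy as np

# Same data as in the question.
words = {
    'a': np.array([1.78505888, -0.40040435, -0.2555062]),
    'c': np.array([0.58101204, -0.23254054, -0.5700197]),
    'b': np.array([1.17213122, 0.38232652, -0.78477569]),
    'd': np.array([-0.07545012, -0.10094538, -0.98136142]),
}
sentences = [['a', 'c'], ['b', 'a', 'd'], ['d', 'c']]

# One stacked 2-D array per sentence. np.vstack is given a list here,
# since recent NumPy versions do not accept a bare generator.
rows = [np.vstack([words[w] for w in sentence]) for sentence in sentences]

for r in rows:
    print(r.shape)
```

The dictionary lookup raises a KeyError for a word that is not in `words`, which may be preferable to the silent mismatch that `np.searchsorted` can produce in that case.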