Create multidimensional numpy array from specific keys of dictionary

Time: 2017-10-30 15:32:53

Tags: python arrays numpy dictionary vectorization

I have a dictionary like this:

a = dict(zip(['k1', 'k2', 'k3', 'k4'],
             [[1,2,3,4], [10,20,30,40], [100,200,300,400], [1000,2000,3000,4000]]))

>>> a
{'k1': [1, 2, 3, 4], 'k2': [10, 20, 30, 40], 'k3': [100, 200, 300, 400], 'k4': [1000, 2000, 3000, 4000]}

What I want to do: get the values for several keys and create a multidimensional numpy array from them. Something like this:

result = numpy.array([a[x] for x in ('k1', 'k3')])

I tried this code:

ar = numpy.array([])
for el in ['k1', 'k3']:
    ar = numpy.r_[ar, a[el]]
ar = ar.reshape(2, -1)  # -1 infers the row length; len(ar)/2 is a float in Python 3

But is there a built-in function or a more elegant way?

2 answers:

Answer 0 (score: 1)

A list of lists is a normal input for np.array, so your list comprehension makes sense.

In [382]: [a[x] for x in ['k1','k3']]
Out[382]: [[1, 2, 3, 4], [100, 200, 300, 400]]

Or the whole dictionary:

In [385]: np.array(list(a.values()))    # list required in py3
Out[385]: 
array([[1000, 2000, 3000, 4000],
       [   1,    2,    3,    4],
       [  10,   20,   30,   40],
       [ 100,  200,  300,  400]])

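Normally, dictionary items are selected one at a time in a comprehension. operator has a convenience class that fetches several keys in a single call (I don't think it makes much difference in speed):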

In [386]: import operator
In [387]: operator.itemgetter('k1','k3')(a)
Out[387]: ([1, 2, 3, 4], [100, 200, 300, 400])
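The tuple that itemgetter returns can be passed straight to np.array. A minimal sketch of that, where the names keys and result are only for illustration:

```python
import operator
import numpy as np

a = dict(zip(['k1', 'k2', 'k3', 'k4'],
             [[1, 2, 3, 4], [10, 20, 30, 40],
              [100, 200, 300, 400], [1000, 2000, 3000, 4000]]))

keys = ('k1', 'k3')  # illustrative name, not from the question

# itemgetter(*keys)(a) returns a tuple of the selected lists,
# which np.array turns into a 2-D array with one row per key
result = np.array(operator.itemgetter(*keys)(a))
print(result.shape)
```

One thing to watch: with a single key, itemgetter returns the item itself rather than a 1-tuple, so the result would be 1-D. The list comprehension does not have that special case.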

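I don't think iterating with r_ is a good choice. r_ is just a front end for concatenate. If you must iterate, repeated concatenate calls will be slower. It is better to build a list and create the array once at the end (as in the list comprehension).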
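To make the comparison concrete, here is a rough sketch of both approaches side by side (the names keys, from_list and from_r are mine). It only checks that they produce the same values; it does not measure speed:

```python
import numpy as np

a = dict(zip(['k1', 'k2', 'k3', 'k4'],
             [[1, 2, 3, 4], [10, 20, 30, 40],
              [100, 200, 300, 400], [1000, 2000, 3000, 4000]]))
keys = ['k1', 'k3']  # illustrative name

# Build a plain list first, then create the array once
from_list = np.array([a[k] for k in keys])

# Repeated concatenation: every pass through the loop copies the whole array
from_r = np.array([])
for k in keys:
    from_r = np.r_[from_r, a[k]]
from_r = from_r.reshape(len(keys), -1)

print(np.array_equal(from_list, from_r))
print(from_list.dtype, from_r.dtype)
```

Besides the extra copying, the r_ version starts from np.array([]), which is float64, so the result comes out as floats even though the inputs are integers.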

Answer 1 (score: 0)

I needed a single numpy array from the data, and I could not find a way to do it without a loop. I wrote this function:

def loadFromDict( fieldnames, dictionary ):
    ''' fieldnames - list of needed keys, dictionary - dict for extraction
     result - numpy.array size of (number of keys, lengths of columns in dict)'''
    ar = numpy.zeros( (len(fieldnames), len(dictionary[fieldnames[0]])) )
    for c,v in enumerate(fieldnames,0):
        ar[c,:] = dictionary[v]
    return ar

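In my case all columns in the dictionary have the same length. In any case, it is easy to handle columns of different lengths: use [len(v) for v in dictionary.values()] to get all the lengths, or look up the length for the current key.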
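One possible reading of that last remark, as a sketch only: allocate the array for the longest column and pad the shorter rows with NaN. The function name load_from_dict_padded and the choice of NaN padding are my assumptions, not part of the function above.

```python
import numpy as np

def load_from_dict_padded(fieldnames, dictionary):
    """Hypothetical variant: rows shorter than the longest one are padded with NaN."""
    # Collect the lengths of the requested columns only
    lengths = [len(dictionary[k]) for k in fieldnames]
    ar = np.full((len(fieldnames), max(lengths)), np.nan)
    for c, key in enumerate(fieldnames):
        row = dictionary[key]
        ar[c, :len(row)] = row  # fill what exists, leave the rest as NaN
    return ar

ragged = {'x': [1, 2, 3], 'y': [4]}
out = load_from_dict_padded(['x', 'y'], ragged)
print(out)
```

If padding is not wanted, the same list of lengths can be used to raise an error when the lengths differ.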