Question

I have a dictionary like this:

a = dict(zip(['k1', 'k2', 'k3', 'k4'],
             [[1, 2, 3, 4], [10, 20, 30, 40], [100, 200, 300, 400], [1000, 2000, 3000, 4000]]))

>>> a
{'k1': [1, 2, 3, 4], 'k2': [10, 20, 30, 40], 'k3': [100, 200, 300, 400], 'k4': [1000, 2000, 3000, 4000]}

What I want to do: get the values for several keys and create a multidimensional numpy array from them. Something like this:

result = numpy.array([a[x] for x in ('k1', 'k3')])

I tried this code:

ar = numpy.array([])
for el in ['k1', 'k3']:
    ar = numpy.r_[ar, a[el]]
ar = ar.reshape(2, len(ar) // 2)   # integer division, required in py3

But is there a built-in function or a more elegant way to do this?

Answer 1

A list of lists is the normal input to np.array, so your list comprehension makes sense.

In [382]: [a[x] for x in ['k1','k3']]
Out[382]: [[1, 2, 3, 4], [100, 200, 300, 400]]

Or the whole dictionary:

In [385]: np.array(list(a.values()))    # list required in py3
Out[385]: 
array([[1000, 2000, 3000, 4000],
       [   1,    2,    3,    4],
       [  10,   20,   30,   40],
       [ 100,  200,  300,  400]])
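
The rows above come out in the dictionary's iteration order, which was arbitrary on the Python version used for that session. A minimal sketch, assuming you want a predictable row order, is to fix the key order yourself before building the array (the names `keys` and `whole` are only illustrative):

```python
import numpy as np

a = {'k1': [1, 2, 3, 4], 'k2': [10, 20, 30, 40],
     'k3': [100, 200, 300, 400], 'k4': [1000, 2000, 3000, 4000]}

# Pick the row order explicitly instead of relying on dict iteration order.
keys = sorted(a)
whole = np.array([a[k] for k in keys])

# Row i of `whole` now corresponds to keys[i].
print(keys)
print(whole)
```

On Python 3.7 and later, dicts preserve insertion order, so `np.array(list(a.values()))` would also give rows in the order the keys were inserted.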

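Normally dictionary items are selected one at a time, as in the comprehension. operator has a convenience class that fetches several keys in one call (I don't think it makes much difference in speed):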

In [386]: import operator
In [387]: operator.itemgetter('k1','k3')(a)
Out[387]: ([1, 2, 3, 4], [100, 200, 300, 400])
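
The tuple returned by itemgetter can be passed straight to np.array. A small sketch using the dictionary from the question (the names `rows` and `result` are only illustrative):

```python
import operator
import numpy as np

a = {'k1': [1, 2, 3, 4], 'k2': [10, 20, 30, 40],
     'k3': [100, 200, 300, 400], 'k4': [1000, 2000, 3000, 4000]}

# itemgetter with several keys returns a tuple of the corresponding values.
rows = operator.itemgetter('k1', 'k3')(a)

# A tuple of equal-length lists converts to a 2-D array, one row per key.
result = np.array(rows)
print(result.shape)
```

Be aware that itemgetter called with a single key returns the value itself rather than a one-element tuple, so the result would then be a 1-D array.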

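I don't think iterating with r_ is a good choice. r_ is just a cover for concatenate, and if you must iterate, repeated concatenation is slow. It is better to build a list and create the array once at the end (as in the list comprehension).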
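
To make the comparison concrete, here is a rough sketch of both approaches side by side, using the dictionary from the question. It is only an illustration, not a benchmark:

```python
import numpy as np

a = {'k1': [1, 2, 3, 4], 'k2': [10, 20, 30, 40],
     'k3': [100, 200, 300, 400], 'k4': [1000, 2000, 3000, 4000]}
keys = ['k1', 'k3']

# Repeated concatenation: every pass allocates a new array and copies
# everything accumulated so far.
ar = np.array([])
for k in keys:
    ar = np.r_[ar, a[k]]
ar = ar.reshape(len(keys), -1)

# Build a plain list first, then convert once at the end.
result = np.array([a[k] for k in keys])

print(np.array_equal(ar, result))
print(ar.dtype, result.dtype)
```

Both give the same values, but the r_ version ends up as float64 because it starts from the empty array np.array([]), while the list version keeps an integer dtype.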

Answer 2

I need a single numpy array from the data, and I couldn't find a way to do it without a loop. I wrote this function:

def loadFromDict( fieldnames, dictionary ):
    ''' fieldnames - list of needed keys, dictionary - dict for extraction
     result - numpy.array size of (number of keys, lengths of columns in dict)'''
    ar = numpy.zeros( (len(fieldnames), len(dictionary[fieldnames[0]])) )
    for c,v in enumerate(fieldnames,0):
        ar[c,:] = dictionary[v]
    return ar

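In my case every column in the dictionary has the same length. In any case, it is easy to handle columns of different lengths: use [len(v) for v in dictionary.values()] to get all the lengths, or look up the length for the current key.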
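
One possible way to handle different lengths, sketched below, is to size the array by the longest selected column and pad the shorter rows. The function name `load_from_dict_padded` and the choice of NaN as the fill value are my assumptions, not part of the function above:

```python
import numpy as np

def load_from_dict_padded(fieldnames, dictionary, fill=np.nan):
    """Hypothetical variant of loadFromDict for columns of unequal length.

    Returns a float array of shape (number of keys, longest column),
    with shorter rows padded on the right with `fill`.
    """
    lengths = [len(dictionary[k]) for k in fieldnames]
    ar = np.full((len(fieldnames), max(lengths)), fill, dtype=float)
    for row, (key, n) in enumerate(zip(fieldnames, lengths)):
        ar[row, :n] = dictionary[key]   # fill only the first n slots
    return ar

ragged = {'k1': [1, 2, 3], 'k2': [10, 20]}
padded = load_from_dict_padded(['k1', 'k2'], ragged)
print(padded)
```

A float dtype is used here because NaN cannot be stored in an integer array; with integer data you would need a different fill value.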

Create multidimensional numpy array from specific keys of dictionary

2 answers: