Numpy array (matrix) where one axis index is a string

Asked: 2015-03-23 12:18:05

Tags: python numpy dictionary matrix

In numpy, one can create a matrix and use the convenient slicing notation:

import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]])
print(arr[2, :])
print(arr[1:2, 2])

This extends to N dimensions.

But what if I want the same thing, with one axis being string-based instead of numeric? Indexing elements would then look like:

print(arr["cylinder", :, :]) #prints all cylinders
print(arr["sphere", 4, 100]) #prints sphere of 4 radius, 100 bar
print(arr[:, 4, 100]) #prints every shape with 4 radius 100 bar

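I could write a dedicated function for each "combination" (all shapes, specific radius, specific pressure ... all shapes, all radii, specific pressure ... specific shape, specific radius, specific pressure), but that is not feasible. So how would I create this?

Currently everything is stored as a dictionary of dictionaries (especially since only a few radius and pressure values are used). If the underlying storage could stay a dictionary of dictionaries, but with slicing/indexing operators added, that would be golden!

Current code (yes, I do intend to look into kwargs to make it easier to add new points to the current code base). This is only added to head off "NP" problems: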

class all_measurements(object):
    def __init__(self):
        self.measurements = {}

    def add_measurement(self, measurement):
        shape = measurement.shape
        size = measurement.size
        pressure = measurement.pressure
        fname = measurement.filename
        if shape in self.measurements:
            shape_dict = self.measurements[shape]
        else:
            shape_dict = {}
            self.measurements[shape] = shape_dict

        if size in shape_dict:
            size_dict = shape_dict[size]
        else:
            size_dict ={}
            shape_dict[size] = size_dict

        if pressure in size_dict:
            pressure_dict = size_dict[pressure]
        else:
            pressure_dict = {}
            size_dict[pressure] = pressure_dict

        if fname in pressure_dict:
            print("adding same file twice!")

        pressure_dict[fname] = measurement

    def get_measurements(self, shape = None, size = None, pressure = None, fname = None):
        current_dict = self.measurements
        if shape is None:
            return current_dict
        if shape in current_dict:
            current_dict = current_dict[shape]
        else:
            return None

        if size is None:
            return current_dict
        if size in current_dict:
            current_dict = current_dict[size]
        else:
            return None

        if pressure is None:
            return current_dict
        if pressure in current_dict:
            current_dict = current_dict[pressure]
        else:
            return None

        if fname is None:
            return current_dict
        if fname in current_dict:
            return current_dict[fname]
        else:
            return None

2 Answers:

Answer 0 (score: 1)

I think you are looking for structured arrays; see the NumPy documentation on structured arrays.

Example:

>>> import numpy as np

>>> a = np.zeros(10,dtype={'names':['a','b','c'],'formats':['f4','f4','f4']})

# write some data in a
>>> a['a'] = np.arange(10)
>>> a['b'] = np.arange(10,20)
>>> a['c'] = np.arange(20,30)

>>> a
array([(0.0, 10.0, 20.0), 
       (1.0, 11.0, 21.0), 
       (2.0, 12.0, 22.0),
       (3.0, 13.0, 23.0), 
       (4.0, 14.0, 24.0), 
       (5.0, 15.0, 25.0),
       (6.0, 16.0, 26.0), 
       (7.0, 17.0, 27.0), 
       (8.0, 18.0, 28.0),
       (9.0, 19.0, 29.0)], 
  dtype=[('a', '<f4'), ('b', '<f4'), ('c', '<f4')])

>>> a['a'][2:6]
array([ 2.,  3.,  4.,  5.], dtype=float32)

>>> a[4:8]
array([(4.0, 14.0, 24.0), 
       (5.0, 15.0, 25.0), 
       (6.0, 16.0, 26.0),
       (7.0, 17.0, 27.0)], 
  dtype=[('a', '<f4'), ('b', '<f4'), ('c', '<f4')])
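
To connect this to the question, here is a minimal sketch of how a structured array could hold one record per measurement, with the shape stored in a string field. The field names (`shape`, `radius`, `pressure`, `value`) and the sample rows are assumptions made up for illustration, not taken from the question's data. Note that selection is done with boolean masks on the fields rather than with `arr["sphere", 4, 100]`-style indexing.

```python
import numpy as np

# Hypothetical record layout: one row per measurement.
dtype = [('shape', 'U10'), ('radius', 'i4'), ('pressure', 'f8'), ('value', 'f8')]
data = np.array([
    ('cylinder', 4, 100.0, 1.5),
    ('sphere', 4, 100.0, 2.5),
    ('sphere', 2, 50.0, 3.5),
], dtype=dtype)

# All spheres, whatever their radius and pressure
spheres = data[data['shape'] == 'sphere']

# Every shape with radius 4 at pressure 100
mask = (data['radius'] == 4) & (data['pressure'] == 100.0)
selected = data[mask]

print(spheres)
print(selected['shape'])
```

This gives flat, table-like storage, so a "wildcard" on an axis simply means leaving that field out of the mask.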

Answer 1 (score: 0)

The repeated use of a pattern like:

    if shape in self.measurements:
        shape_dict = self.measurements[shape]
    else:
        shape_dict = {}
        self.measurements[shape] = shape_dict

suggests that you could use collections.defaultdict to advantage.
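
A minimal sketch of that idea, assuming the same shape -> size -> pressure -> filename nesting as in the question. The helper name `nested_dict` is made up here, and plain strings stand in for the measurement objects:

```python
from collections import defaultdict


def nested_dict():
    # Each missing key creates another nested defaultdict on first access.
    return defaultdict(nested_dict)


measurements = nested_dict()

# No "if key in dict ... else create" blocks are needed when adding entries.
measurements['round'][10][20.0]['test0'] = 'measurement round,10,20.0,test0'
measurements['round'][1][20.0]['test2'] = 'measurement round,1,20.0,test2'

print(sorted(measurements['round']))
```

One caveat: reading a missing key with `[...]` also creates it, so lookups that should not modify the structure are better done with `in` or `.get()`.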

When I populate an all_measurements object with measurement objects (using a simple class of my own),

A = all_measurements()
A.add_measurement(measurement('round',10,20.0,'test0'))
A.add_measurement(measurement('square',10,30.0,'test1'))
A.add_measurement(measurement('round',1,20.0,'test2'))
print(A.measurements)

I get a dictionary like:

{'square': {10: {30.0: {'test1': measurement: square,10,30.0,test1}}},
 'round': {1: {20.0: {'test2': measurement: round,1,20.0,test2}}, 
           10: {20.0: {'test0': measurement: round,10,20.0,test0}}}}

I don't see anything here that looks like a 3d array.

I suppose if there were a standard set of shapes, sizes and pressures, e.g.

shapes = ['round', 'square', 'flat']
sizes = [1,3,10,20]
pressures = [10.0, 20.0, 30.0]

you could construct a 3d array, e.g.

np.empty((3,4,3))

and dictionaries or lists of tuples that map labels to indices, e.g.

sizemap={1:0, 3:1, 10:2, 20:3}
sizelist=[(1,0),(3,1)...]

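But what would the values of this array be? measurement objects? ndarrays of object dtype are possible, but are usually no better than nested lists or dictionaries.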
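
As a rough sketch of the label-to-index idea, assuming each cell holds a single number (a hypothetical payload; the question stores whole measurement objects per file), the maps can be built from the label lists and used to translate labels before ordinary numpy indexing. The names `shapemap` and `pressuremap` are added here by analogy with `sizemap`:

```python
import numpy as np

shapes = ['round', 'square', 'flat']
sizes = [1, 3, 10, 20]
pressures = [10.0, 20.0, 30.0]

# Label -> index maps, one per axis
shapemap = {label: i for i, label in enumerate(shapes)}
sizemap = {label: i for i, label in enumerate(sizes)}
pressuremap = {label: i for i, label in enumerate(pressures)}

# NaN marks "no measurement" in this sketch
arr = np.full((len(shapes), len(sizes), len(pressures)), np.nan)
arr[shapemap['round'], sizemap[10], pressuremap[20.0]] = 1.23

# Every shape at size 10 and pressure 20.0
column = arr[:, sizemap[10], pressuremap[20.0]]

# Everything recorded for 'round'
round_block = arr[shapemap['round'], :, :]

print(column)
print(round_block.shape)
```

This only works when the set of labels is fixed in advance, and the array is mostly empty when few combinations are actually measured.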


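I tested your get_measurements. As currently structured, you have to select a shape, then a size within that selection, and so on. It cannot return all shapes that have a specific size value.

This method lets me pass arguments to get_measurements using indexing (including slice) syntax: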

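def __getitem__(self, key):
    print(key)
    if not isinstance(key, tuple):
        key = (key,)  # a single index such as A['round'] is not wrapped in a tuple
    key = list(key)  # multiple indices come in as a tuple
    for i, k in enumerate(key):
        if isinstance(k, slice):
            # code to interpret a slice goes here
            key[i] = None  # fall back, do nothing
    return self.get_measurements(*key)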

from pprint import pprint

pprint(A['round',10])
pprint(A[:,10])
pprint(A['round':'square', 10:30:10])

produces

('round', 10)
{20.0: {'test0': measurement: round,10,20.0,test0}}

(slice(None, None, None), 10)
{'round': ...}

(slice('round', 'square', None), slice(10, 30, 10))
{'round': ...}

You have to decide what objects like

slice('round','square', None)
slice(10, 30, 10)

mean in the context of your attributes.
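
One possible interpretation, sketched under the assumption that only a bare `:` (that is, `slice(None)`) needs to be supported and that it should mean "every key at this level". The function name `select` and the sample data are hypothetical. It walks the nested dictionaries and returns a flat list of (path, value) pairs:

```python
def select(nested, key):
    """Return (path, value) pairs from nested dicts that match key.

    key has one entry per nesting level. A bare ':' (slice(None)) matches
    every key at that level; any other entry must match exactly.
    """
    results = []

    def walk(node, remaining, path):
        if not remaining:
            results.append((path, node))
            return
        k, rest = remaining[0], remaining[1:]
        if isinstance(k, slice):
            if k != slice(None):
                # Ranges such as 10:30:10 are left undecided in this sketch.
                raise NotImplementedError("only a bare ':' is handled here")
            candidates = list(node)
        else:
            candidates = [k] if k in node else []
        for c in candidates:
            walk(node[c], rest, path + (c,))

    walk(nested, tuple(key), ())
    return results


# Plain strings stand in for measurement objects.
data = {
    'round': {10: {20.0: 'r-10-20'}, 1: {20.0: 'r-1-20'}},
    'square': {10: {30.0: 's-10-30'}},
}

by_size = select(data, (slice(None), 10))
by_pressure = select(data, (slice(None), slice(None), 20.0))
print(by_size)
print(by_pressure)
```

A `__getitem__` could then normalise its key to a tuple and call something like `select(self.measurements, key)` instead of replacing slices with None, which would make `A[:, 10]` return the size-10 entries of every shape.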