Is it possible to alias multiple names in a numpy record array?

Time: 2015-05-17 20:34:16

Tags: numpy

Suppose I construct a numpy record array like this:

import numpy as np

num_rows = 3  # placeholder; any row count works
data = np.zeros(
    (num_rows,),
    dtype={
        'names': ['apple', 'banana'],
        'formats': ['f8', 'f8']
    }
)

Now I can access the data by name or by index. For example, the following are equivalent:

data['banana'][0]

data[0]['banana']

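and so on. Is there a way to alias a field under a different name? For example, can I set things up so that there is another name, manzana, such that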

data['manzana']

is the same as
data['apple']

2 answers:

Answer 0 (score: 3)

['offsets' and 'titles' are two mechanisms for giving a field more than one name]

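There is an offsets parameter that can be used this way. It is normally used to split a field into several parts (for example, an int into bytes), but it also works for whole fields of the same type. In effect, it defines several fields whose data overlap.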

In [743]: dt=np.dtype({'names':['apple','manzana','banana','guineo'],
       'formats':['f8','f8','f8','f8'], 
       'offsets':[0,0,8,8]})

In [745]: np.zeros((3,),dtype=dt)
Out[745]: 
array([(0.0, 0.0, 0.0, 0.0), (0.0, 0.0, 0.0, 0.0), (0.0, 0.0, 0.0, 0.0)], 
      dtype={'names':['apple','manzana','banana','guineo'], 
        'formats':['<f8','<f8','<f8','<f8'], 
        'offsets':[0,0,8,8], 'itemsize':16})

In [746]: A=np.zeros((3,),dtype=dt)

In [747]: A['banana']=[1,2,3]

In [748]: A
Out[748]: 
array([(0.0, 0.0, 1.0, 1.0),  
       (0.0, 0.0, 2.0, 2.0), 
       (0.0, 0.0, 3.0, 3.0)], 
      dtype={'names':['apple','manzana','banana','guineo'], 'formats':['<f8','<f8','<f8','<f8'], 'offsets':[0,0,8,8], 'itemsize':16})

In [749]: A['guineo']
Out[749]: array([ 1.,  2.,  3.])

In [750]: A['manzana']=[.1,.2,.3]

In [751]: A['apple']
Out[751]: array([ 0.1,  0.2,  0.3])

In [752]: A
Out[752]: 
array([(0.1, 0.1, 1.0, 1.0),  
       (0.2, 0.2, 2.0, 2.0), 
       (0.3, 0.3, 3.0, 3.0)], 
      dtype={'names':['apple','manzana','banana','guineo'], 'formats':['<f8','<f8','<f8','<f8'], 'offsets':[0,0,8,8], 'itemsize':16})
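
The more typical use of offsets mentioned above, exposing the individual bytes of an integer, can be sketched roughly as follows. The field names value and raw are made up for illustration, and the byte order relies on the explicit little-endian '<u4' format.

```python
import numpy as np

# Hypothetical field names: 'value' is a 4-byte unsigned integer and
# 'raw' is a 4-element uint8 subarray placed at the same offset.
dt = np.dtype({
    'names': ['value', 'raw'],
    'formats': ['<u4', ('u1', (4,))],
    'offsets': [0, 0],
})

a = np.zeros(2, dtype=dt)
a['value'] = [1, 0x01020304]

# Both fields read the same four bytes of each record.
print(a['raw'])
```

Because both fields start at offset 0, the record is still only 4 bytes wide, and writing through either field changes what the other one reads.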

There is another dtype parameter, titles, that fits your needs better and is easier to understand:

http://docs.scipy.org/doc/numpy/reference/arrays.dtypes.html

In [792]: dt1=np.dtype({'names':['apple','banana'],'formats':['f8','f8'], 'titles':['manzana', 'guineo'], 'offsets':[0,8]})

In [793]: A1=np.zeros((3,),dtype=dt1)

In [794]: A1
Out[794]: 
array([(0.0, 0.0), (0.0, 0.0), (0.0, 0.0)], 
      dtype=[(('manzana', 'apple'), '<f8'), (('guineo', 'banana'), '<f8')])

In [795]: A1['apple']=[1,2,3]

In [796]: A1['guineo']=[.1,.2,.3]

In [797]: A1
Out[797]: 
array([(1.0, 0.1), (2.0, 0.2), (3.0, 0.3)], 
      dtype=[(('manzana', 'apple'), '<f8'), (('guineo', 'banana'), '<f8')])

In [798]: A1['banana']
Out[798]: array([ 0.1,  0.2,  0.3])
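
As a small sketch of how titles show up on the dtype itself (using the list-of-tuples spelling that appears in the repr above): the title is an extra key in dtype.fields, while dtype.names lists only the real field names.

```python
import numpy as np

# Same layout as dt1 above, written as (title, name) tuples.
dt = np.dtype([(('manzana', 'apple'), 'f8'), (('guineo', 'banana'), 'f8')])
a = np.zeros(3, dtype=dt)

# Writing through the title changes the field it labels.
a['manzana'] = [1.0, 2.0, 3.0]
same_memory = np.shares_memory(a['apple'], a['manzana'])

names = dt.names                     # real field names only
field_keys = sorted(dt.fields)       # names and titles together
apple_title = dt.fields['apple'][2]  # third tuple element is the title

print(names, field_keys, apple_title, same_memory)
```

Unlike the offsets approach, this leaves the record with two fields, so printing the array shows each value once.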

Answer 1 (score: 0)

I have wrapped @hpaulj's answer in a simple function and am sharing it here in case anyone wants to use it.

import numpy as np

def add_alias(arr, original, alias):
    """
    Adds an alias to the field with the name original to the array arr.
    Only one alias per field is allowed.
    """

    if arr.dtype.names is None:
        raise TypeError("arr must be a structured array. Use add_name instead.")
    descr = arr.dtype.descr

    try:
        index = arr.dtype.names.index(original)
    except ValueError:
        raise ValueError("arr does not have a field named '" + str(original) 
                         + "'")

    if isinstance(descr[index][0], tuple):
        raise ValueError("The field " + str(original) + 
                         " already has an alias.")

    descr[index] = ((alias, descr[index][0]), descr[index][1])
    arr.dtype = np.dtype(descr)
    return arr

def add_name(arr, name):
    """
    Adds a name to the data of an unstructured array.
    """

    if arr.dtype.names is not None:
        raise TypeError("arr must not be a structured array. "
                        + "Use add_alias instead.")
    arr.dtype = np.dtype([(name, arr.dtype.name)])
    return arr
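
The helpers above change the dtype of the array they are given. If the original array should stay untouched, one possible alternative is to reinterpret it with ndarray.view. This is only a sketch, and it assumes a packed layout in which the new dtype has the same item size as the old one.

```python
import numpy as np

data = np.zeros(3, dtype=[('apple', 'f8'), ('banana', 'f8')])

# Same fields and sizes, but 'apple' also carries the title 'manzana'.
aliased_dtype = np.dtype([(('manzana', 'apple'), 'f8'), ('banana', 'f8')])

# view() reinterprets the same buffer; data keeps its original dtype.
aliased = data.view(aliased_dtype)

aliased['manzana'] = [1.0, 2.0, 3.0]
print(data['apple'])
```

Both arrays share one buffer, so writes through aliased are visible in data, but only aliased accepts the name manzana.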