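How to create a smaller array from a larger array in Python

Time: 2013-11-27 01:25:43

Tags: python arrays numpy indexing where

I have a large 4D dataset and need to create a smaller 4D array from it. I am very new to Python; I am used to IDL or MATLAB. I read in my values and then use the where function to find, as smaller 1D arrays, the index numbers I need along each dimension. I am trying to build a new array from those index numbers, but I keep getting a shape mismatch error (objects cannot be broadcast to a single shape).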

import numpy as n
import matplotlib.pyplot as plt
import Scientific.IO.NetCDF as S

file=S.NetCDFFile('wspd.mon.mean.nc',mode='r') #Opening File
Lat=file.variables['lat'].getValue()     # Reading in the latitude variables, 73
Lon=file.variables['lon'].getValue()     # Reading in the longitude variables, 144
Level=file.variables['level'].getValue() # Reading in the levels, 17 of them
Defaulttime=file.variables['time'].getValue()   # Reading in the time, hrs since 1-1-1
Defaultwindspeed=file.variables['wspd'].getValue()     # Reading in the windspeed(time, level, lat, lon)

Time=n.arange(len(Defaulttime))/12.+1948  #Creates time array into readable years with 12 months
goodtime=n.where((Time>=1948)&(Time<2013)) #Creates a time array for the years that I want, 1948-2012, since 2013 only has until October, I will not be using that data.
goodlat=n.where((Lat>=35)&(Lat<=50))  #Latitudes where the rockies and plains are in the US
plainslon=n.where((Lon>=275)&(Lon<=285))

Windspeedsplains=Defaultwindspeed[goodtime,:,goodlat,plainslon]

The following error is generated by the line above (the last line of code).

>>>ValueError: shape mismatch: objects cannot be broadcast to a single shape

1 answer:

Answer 0 (score: 0)

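What is happening is that each of the index (where) arrays has a different length. NumPy tries to broadcast the index arrays against one another, and arrays of three different lengths cannot be broadcast to a single shape, hence the error. To force them to broadcast to the right shape, reshape each index array so that it varies along its own axis only. The level axis gets an explicit index array rather than a bare : because, when a slice sits between index arrays, NumPy moves the broadcast index dimensions to the front of the result and puts the sliced axis last:

levels = n.arange(Defaultwindspeed.shape[1])  # index every level explicitly
Windspeedsplains = Defaultwindspeed[goodtime[0][:, None, None, None], levels[:, None, None], goodlat[0][:, None], plainslon[0]]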

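The [0] is there because np.where(a) returns a tuple of length a.ndim, and each element of the tuple is an array of indices where your condition holds. I am assuming all of your boolean arrays are 1-D, so every where output is a tuple of length 1; we only want the single array inside it, hence the [0].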
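As a small illustration of the tuple that np.where returns, here is a sketch on made-up conditions (the arrays are invented for the example and are not the question's data):

```python
import numpy as np

# 1-D condition: a tuple holding a single index array
cond_1d = np.array([False, True, True, False])
idx_1d = np.where(cond_1d)
print(len(idx_1d), idx_1d[0])   # 1 [1 2]

# 2-D condition: one index array per axis
cond_2d = np.array([[True, False],
                    [False, True]])
idx_2d = np.where(cond_2d)
print(len(idx_2d))              # 2
```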

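Once we have the arrays, we reshape them to match the shape you want the output array to have. Presumably your output should have shape (goodtime[0].size, Defaultwindspeed.shape[1], goodlat[0].size, plainslon[0].size), so each index array must have a shape that varies only along the axis it is meant to select. For example, you want Windspeedsplains to vary with time along axis 0 of its 4 axes. The goodtime index must therefore also vary only along axis 0 of four, so you force it to shape (N, 1, 1, 1), which is what [:, None, None, None] does.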
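The reshaping argument can be checked on a small made-up array. This is only a sketch: data, t, y and x are hypothetical stand-ins for the wind-speed array and the where outputs, and the level axis is given an explicit arange so that all four indices are index arrays.

```python
import numpy as np

# Hypothetical stand-in for the wind-speed array: (time, level, lat, lon)
data = np.arange(6 * 3 * 5 * 4).reshape(6, 3, 5, 4)

t = np.array([0, 1, 2, 3])   # 4 time indices
y = np.array([1, 2])         # 2 latitude indices
x = np.array([0, 2, 3])      # 3 longitude indices

# Flat index arrays of lengths 4, 2 and 3 cannot be broadcast together.
# Older NumPy raises ValueError here, newer versions raise IndexError.
try:
    data[t, :, y, x]
    failed = False
except (IndexError, ValueError):
    failed = True
print(failed)

# Reshaped so that each index array varies along its own axis only
t4 = t[:, None, None, None]                      # shape (4, 1, 1, 1)
lev4 = np.arange(data.shape[1])[:, None, None]   # shape (3, 1, 1)
y4 = y[:, None]                                  # shape (2, 1)

print(np.broadcast(t4, lev4, y4, x).shape)       # (4, 3, 2, 3)
sub = data[t4, lev4, y4, x]
print(sub.shape)                                 # (4, 3, 2, 3)
```

Each element sub[i, j, k, l] is data[t[i], j, y[k], x[l]], which is the rectangular sub-block the question is after.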

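So you can make the indexing line more readable like this:

goodtime = n.where((Time>=1948)&(Time<2013))[0][:, None, None, None]
levels = n.arange(Defaultwindspeed.shape[1])[:, None, None]  # every level; a bare : here would move the level axis to the end of the result
goodlat = n.where((Lat>=35)&(Lat<=50))[0][:, None]
plainslon = n.where((Lon>=275)&(Lon<=285))[0]

Windspeedsplains=Defaultwindspeed[goodtime, levels, goodlat, plainslon]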
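One detail worth checking when a slice sits between index arrays: NumPy then moves the broadcast index dimensions to the front of the result and appends the sliced axis at the end. Below is a minimal sketch on a made-up array (names and sizes are illustrative only), compared with np.ix_, which builds the reshaped index arrays for every axis and keeps the original axis order.

```python
import numpy as np

data = np.arange(6 * 3 * 5 * 4).reshape(6, 3, 5, 4)  # (time, level, lat, lon)
t = np.array([0, 1, 2, 3])
y = np.array([1, 2])
x = np.array([0, 2, 3])

# Slice between index arrays: index dimensions first, sliced axis last
with_slice = data[t[:, None, None], :, y[:, None], x]
print(with_slice.shape)   # (4, 2, 3, 3) -> (time, lat, lon, level)

# np.ix_ reshapes one index array per axis, keeping the original axis order
lev = np.arange(data.shape[1])
with_ix = data[np.ix_(t, lev, y, x)]
print(with_ix.shape)      # (4, 3, 2, 3) -> (time, level, lat, lon)

# Same values, different axis order
print(np.array_equal(with_slice.transpose(0, 3, 1, 2), with_ix))  # True
```

If the slice form is preferred, a transpose as in the last line restores the (time, level, lat, lon) order.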

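Or, since NumPy can also index with boolean arrays directly, skip where altogether. A boolean mask has to stay 1-D to select along a single axis (a reshaped mask is not broadcast the way an integer index array is), so apply the masks one axis at a time:

goodtime = (Time>=1948)&(Time<2013)
goodlat = (Lat>=35)&(Lat<=50)
plainslon = (Lon>=275)&(Lon<=285)

Windspeedsplains=Defaultwindspeed[goodtime][:, :, goodlat][:, :, :, plainslon]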
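For 1-D boolean masks, np.ix_ is another option, since it converts each mask to the matching integer index array. The sketch below uses made-up coordinate vectors (time, lat and lon are hypothetical, not the values from the NetCDF file) and compares the result with applying the masks one axis at a time.

```python
import numpy as np

data = np.arange(6 * 3 * 5 * 4).reshape(6, 3, 5, 4)  # (time, level, lat, lon)

# Made-up coordinate vectors, one per masked axis
time = np.arange(6)
lat = np.array([30.0, 35.0, 40.0, 45.0, 55.0])
lon = np.array([270.0, 275.0, 280.0, 290.0])

tmask = time < 4
latmask = (lat >= 35) & (lat <= 50)
lonmask = (lon >= 275) & (lon <= 285)

# np.ix_ turns each 1-D boolean mask into the matching integer index array
lev = np.arange(data.shape[1])
sub = data[np.ix_(tmask, lev, latmask, lonmask)]
print(sub.shape)   # (4, 3, 3, 2)

# Same result as applying the masks one axis at a time
chained = data[tmask][:, :, latmask][:, :, :, lonmask]
print(np.array_equal(sub, chained))   # True
```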

Here is a slightly simpler example:

In [52]: a = np.arange(3*3*3).reshape(3,3,3)

In [53]: a
Out[53]: 
array([[[ 0,  1,  2],
        [ 3,  4,  5],
        [ 6,  7,  8]],

       [[ 9, 10, 11],
        [12, 13, 14],
        [15, 16, 17]],

       [[18, 19, 20],
        [21, 22, 23],
        [24, 25, 26]]])

In [54]: mask0 = np.where(a[:,0,0] >= 9)

In [55]: mask0
Out[55]: (array([1, 2]),)   # <-- this is the length 1 tuple I was talking about. we want the array inside.

In [56]: mask1 = np.where(a[0,:,0]%2 == 0)

In [57]: mask1
Out[57]: (array([0, 2]),)

In [62]: mask2 = np.where(a[0,0,:] < 1)

In [63]: mask2
Out[63]: (array([0]),)

In [67]: b = a[mask0[0][:, None, None], mask1[0][:, None], mask2[0]]

In [68]: b
Out[68]: 
array([[[ 9],
        [15]],

       [[18],
        [24]]])

In [69]: b.shape
Out[69]: (2, 2, 1)
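The same session can be replayed as a script and cross-checked against np.ix_, which produces index arrays with broadcast-compatible shapes automatically. This is a sketch; the names i0, i1 and i2 are introduced here for the unpacked where outputs.

```python
import numpy as np

a = np.arange(3 * 3 * 3).reshape(3, 3, 3)

i0 = np.where(a[:, 0, 0] >= 9)[0]      # array([1, 2])
i1 = np.where(a[0, :, 0] % 2 == 0)[0]  # array([0, 2])
i2 = np.where(a[0, 0, :] < 1)[0]       # array([0])

# Manual reshaping, as in the session above
b = a[i0[:, None, None], i1[:, None], i2]

# np.ix_ builds one index array per axis with singleton dimensions elsewhere
mesh = np.ix_(i0, i1, i2)
print([m.shape for m in mesh])   # [(2, 1, 1), (1, 2, 1), (1, 1, 1)]

b_ix = a[mesh]
print(np.array_equal(b, b_ix))   # True
print(b_ix.shape)                # (2, 2, 1)
```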