How can I optimize 1D interpolation over a multidimensional data array?

Time: 2021-04-05 07:20:40

Tags: interpolation dask python-xarray numpy-ufunc

I have a 4D data array that I interpolate along the vertical axis only.

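import numpy as np
from scipy.interpolate import interp1d

# da is an xarray DataArray with dims (time, plev, lat, lon)

# array with vertical levels
lev = da.plev

# new temperatures -> dummy values
tem = np.arange(10, 100, 5)

# output array (assumed; not shown in the original snippet):
# the plev axis is replaced by one of length tem.size
holder = np.empty((da.time.size, tem.size, da.lat.size, da.lon.size))

# begin loop for interpolation
for time in range(da.time.size):
    for lat in range(da.lat.size):
        for lon in range(da.lon.size):
            f = interp1d(da[time, :, lat, lon], lev, fill_value='extrapolate')
            holder[time, :, lat, lon] = f(tem)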

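The code works, but it takes a while to run. I am still learning apply_ufunc and Dask, and I have seen some examples that I think could cut the run time considerably (at least compared with the for loops).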

I tried to run something like this:

# return a tuple of DataArrays
res = xr.apply_ufunc(interp1d, hus, lev,
        input_core_dims=[['plev'], ['plev']],
        output_core_dims=[[]],
        vectorize=True)

But when I try to use the interpolation function:

holder = res(tem)

I get an error message saying that a DataArray object is not callable.

Update

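I tried the following code, which puts the interpolator inside a function. I know the function itself works, because I printed some results just before the return statement. The problem appears at the return statement.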

def interp(x, y):
    # Wrapper around scipy interp1d to use in apply_ufunc
    f = interp1d(x,y,fill_value='extrapolate')
    new = f(tem)
    return (new)

holder_new = xr.apply_ufunc(interp, check, p,
        input_core_dims=[['plev'], ['plev']],
        output_core_dims=[[]],
        vectorize=True)

Error message:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
TypeError: only size-1 arrays can be converted to Python scalars

The above exception was the direct cause of the following exception:

ValueError                                Traceback (most recent call last)
<timed exec> in <module>

~/anaconda3/lib/python3.7/site-packages/xarray/core/computation.py in apply_ufunc(func, input_core_dims, output_core_dims, exclude_dims, vectorize, join, dataset_join, dataset_fill_value, keep_attrs, kwargs, dask, output_dtypes, output_sizes, meta, dask_gufunc_kwargs, *args)
   1108             join=join,
   1109             exclude_dims=exclude_dims,
-> 1110             keep_attrs=keep_attrs,
   1111         )
   1112     # feed Variables directly through apply_variable_ufunc

~/anaconda3/lib/python3.7/site-packages/xarray/core/computation.py in apply_dataarray_vfunc(func, signature, join, exclude_dims, keep_attrs, *args)
    260 
    261     data_vars = [getattr(a, "variable", a) for a in args]
--> 262     result_var = func(*data_vars)
    263 
    264     if signature.num_outputs > 1:

~/anaconda3/lib/python3.7/site-packages/xarray/core/computation.py in apply_variable_ufunc(func, signature, exclude_dims, dask, output_dtypes, vectorize, keep_attrs, dask_gufunc_kwargs, *args)
    698             )
    699 
--> 700     result_data = func(*input_data)
    701 
    702     if signature.num_outputs == 1:

~/anaconda3/lib/python3.7/site-packages/numpy/lib/function_base.py in __call__(self, *args, **kwargs)
   2106             vargs.extend([kwargs[_n] for _n in names])
   2107 
-> 2108         return self._vectorize_call(func=func, args=vargs)
   2109 
   2110     def _get_ufunc_and_otypes(self, func, args):

~/anaconda3/lib/python3.7/site-packages/numpy/lib/function_base.py in _vectorize_call(self, func, args)
   2180         """Vectorized call to `func` over positional `args`."""
   2181         if self.signature is not None:
-> 2182             res = self._vectorize_call_with_signature(func, args)
   2183         elif not args:
   2184             res = func()

~/anaconda3/lib/python3.7/site-packages/numpy/lib/function_base.py in _vectorize_call_with_signature(self, func, args)
   2244 
   2245             for output, result in zip(outputs, results):
-> 2246                 output[index] = result
   2247 
   2248         if outputs is None:

ValueError: setting an array element with a sequence.
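
A note on the likely cause, offered as a sketch rather than a confirmed answer: with output_core_dims=[[]], each call of the wrapped function is declared to return a scalar, yet f(tem) returns an array of length tem.size, so numpy.vectorize fails at output[index] = result. Declaring a new output core dimension should avoid this. In the example below the synthetic data, the array shapes, the helper name interp_column, and the new dimension name "tem" are all assumptions made for illustration; they are not from the question.

```python
import numpy as np
import xarray as xr
from scipy.interpolate import interp1d

# Synthetic stand-in for the 4D array in the question (shapes are assumptions).
plev = np.array([1000.0, 850.0, 700.0, 500.0, 300.0])
rng = np.random.default_rng(0)
# Strictly increasing along plev so that interp1d(values, plev) is well defined.
values = np.cumsum(rng.uniform(5.0, 20.0, size=(2, plev.size, 3, 4)), axis=1)
da = xr.DataArray(
    values,
    dims=("time", "plev", "lat", "lon"),
    coords={"plev": plev},
)

# New temperatures (dummy values, as in the question).
tem = np.arange(10, 100, 5)


def interp_column(x, y, new_x):
    """Interpolate y as a function of x and evaluate at new_x (hypothetical helper)."""
    f = interp1d(x, y, fill_value="extrapolate")
    return f(new_x)


holder = xr.apply_ufunc(
    interp_column,
    da,
    da.plev,
    kwargs={"new_x": tem},                 # passed unchanged to every call
    input_core_dims=[["plev"], ["plev"]],  # each call receives one vertical column
    output_core_dims=[["tem"]],            # each call returns an array of length tem.size
    vectorize=True,
)
# Label the new dimension with the target values.
holder = holder.assign_coords(tem=tem)
```

apply_ufunc places output core dimensions last, so the result here has dims (time, lat, lon, tem); holder.transpose("time", "tem", "lat", "lon") would restore the layout used in the loop version. Note that vectorize=True still loops in Python over the columns, so on its own it mainly tidies the code; any speed-up would more likely come from combining it with Dask-backed chunks.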

0 answers:

No answers