How do I extract a new xarray Dataset from an HDF file?

Asked: 2019-06-10 14:56:34

Tags: dataset python-xarray hdf

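I have an HDF file with 30 layers (MODIS data). I read the file with xarray and want to create a new xarray Dataset from FP_latitude, FP_longitude and FP_power. In the new dataset, FP_latitude and FP_longitude should be the dimensions and coordinates, and FP_power should be the data variable.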

My data:

import xarray as xr
ds = xr.open_dataset('~/MOD14.A2018121.0725.006.2018121153451.hdf')
ds.info


<bound method Dataset.info of <xarray.Dataset>
Dimensions:        (cmg_cells_day: 7550, cmg_values: 8, number_of_active_fires: 56, number_of_scan_lines: 2030, pixels_per_scan_line: 1354)
Dimensions without coordinates: cmg_cells_day, cmg_values, number_of_active_fires, number_of_scan_lines, pixels_per_scan_line
Data variables:
    fire mask      (number_of_scan_lines, pixels_per_scan_line) uint8 ...
    algorithm QA   (number_of_scan_lines, pixels_per_scan_line) uint32 ...
    FP_line        (number_of_active_fires) int16 ...
    FP_latitude    (number_of_active_fires) float32 ...
    FP_longitude   (number_of_active_fires) float32 ...
    FP_sample      (number_of_active_fires) int16 ...
    FP_MAD_T21     (number_of_active_fires) float32 ...
    FP_MAD_T31     (number_of_active_fires) float32 ...
    FP_MAD_DT      (number_of_active_fires) float32 ...
    FP_power       (number_of_active_fires) float32 ...
...


Attributes:
    FirePix:                           56
    LandFirePix:                       48
    WaterFirePix:                      8
    MissingPix:                        0
    LandPix:                           2271109
    WaterPix:                          466621
    LandCloudPix:                      492969
    WaterCloudPix:                     24453
    GlintPix:                          236567
    GlintRejectedPix:                  2
    ...

The data arrays that must be used to create the new xarray Dataset:

ds['FP_power'].values
array([13.103112 ,  8.100937 , 20.465372 ,  5.201196 , 26.389868 ,
       26.044945 ,  4.773266 ,  4.217818 ,  7.6388383,  7.2120876,
        7.6221876,  9.314646 ,  4.6933956,  4.096731 ,  6.0109596,
        3.7509809,  4.2455773,  8.730763 ,  4.4746056,  3.4984534,
        6.315592 ,  6.0503035,  5.17827  ,  5.016279 ,  6.9908423,
       24.340796 ,  6.6315703,  8.113305 ,  8.446644 , 13.976195 ,
       23.003496 , 15.349033 ,  7.1619606, 10.015424 , 19.85352  ,
        9.435441 ,  4.913994 , 10.5239935,  8.5827055, 24.54771  ,
        6.333998 ,  8.9611025, 15.0496235,  5.0109186, 45.045345 ,
        6.057521 ,  7.708698 , 87.54773  ,  7.666769 , 12.325472 ,
       10.527838 , 38.443188 , 57.51008  ,  7.5452023,  9.091777 ,
       14.654143 ], dtype=float32)

ds['FP_latitude'].values
array([41.755886, 40.183777, 39.71829 , 35.75638 , 35.974506, 35.978523,
       33.69351 , 33.69187 , 34.273735, 33.531178, 33.529613, 32.68083 ,
       32.268406, 32.266865, 31.185026, 30.899273, 30.865719, 30.491243,
       30.985224, 30.722086, 30.643255, 30.64147 , 30.752913, 30.402008,
       30.317663, 30.720966, 30.233835, 30.35897 , 30.357391, 30.395046,
       30.39352 , 30.828293, 30.244186, 29.912764, 30.767195, 30.348738,
       30.271055, 30.259432, 30.25789 , 30.684727, 29.880253, 29.819761,
       30.039375, 29.204227, 28.497374, 28.495903, 28.488457, 27.036402,
       27.034937, 27.014181, 25.604158, 25.601257, 25.530325, 25.428734,
       25.426567, 24.116196], dtype=float32)

ds['FP_longitude'].values
array([47.823795, 52.83626 , 47.964443, 52.77716 , 43.763355, 43.757786,
       52.00247 , 52.01317 , 47.59178 , 47.60115 , 47.614666, 52.03568 ,
       50.979088, 50.98952 , 52.536816, 52.01254 , 51.459595, 53.593616,
       50.41951 , 51.96342 , 52.39401 , 52.404938, 50.74352 , 52.403107,
       52.58711 , 49.822586, 52.00671 , 51.150295, 51.160656, 50.606544,
       50.6168  , 47.33145 , 51.534843, 53.297   , 47.32797 , 50.29572 ,
       50.818672, 50.774105, 50.784348, 47.33438 , 52.70218 , 52.33389 ,
       50.639126, 52.75078 , 49.714542, 49.724632, 49.712914, 49.55535 ,
       49.5653  , 49.581978, 54.415924, 54.431797, 53.151176, 52.729736,
       52.742325, 52.7592  ], dtype=float32)

My code:

import numpy as np
lat = ds['FP_latitude'].values
long = ds['FP_longitude'].values
power = ds['FP_power'].values
ds = xr.Dataset({'power': xr.Variable(('lon', 'lat'), power),
                 'longitude': xr.Variable('lon', long),
                 'latitude': xr.Variable('lat', lat)})
ds

Error:

ValueError: dimensions ('lon', 'lat') must have the same length as the number of data dimensions, ndim=1

What does this error mean?
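
For reference, the message appears to come from the shape of the data rather than from the file. FP_power is one-dimensional (one value per active fire, 56 in total), but `xr.Variable(('lon', 'lat'), power)` names two dimensions for it. The sketch below is one possible way around this, not a definitive answer. It uses only the first three values from the arrays printed above, and the dimension name `fire` and the variable names `points` and `grid` are made up for illustration. It first reproduces the error, then builds a dataset in which all three arrays share a single dimension, and finally (optionally) unstacks that dimension into a sparse latitude/longitude grid, assuming no two fire pixels share the same (latitude, longitude) pair.

```python
import numpy as np
import xarray as xr

# First three values copied from the arrays printed in the question.
lat = np.array([41.755886, 40.183777, 39.71829], dtype=np.float32)
lon = np.array([47.823795, 52.83626, 47.964443], dtype=np.float32)
power = np.array([13.103112, 8.100937, 20.465372], dtype=np.float32)

# power has ndim == 1, so giving it two dimension names raises the
# ValueError shown in the question.
try:
    xr.Variable(("lon", "lat"), power)
    raised = False
except ValueError as err:
    raised = True
    print(err)

# Each fire pixel has one latitude, one longitude and one power value,
# so all three arrays can share a single dimension ("fire" is a
# hypothetical name). Latitude and longitude become coordinates.
points = xr.Dataset(
    data_vars={"power": ("fire", power)},
    coords={"latitude": ("fire", lat), "longitude": ("fire", lon)},
)
print(points)

# Optional: turn the point list into a 2-D latitude x longitude grid.
# Cells without a fire are filled with NaN, so the grid is mostly empty.
# This step assumes every (latitude, longitude) pair is unique.
grid = points.set_index(fire=["latitude", "longitude"]).unstack("fire")
print(grid)
```

With the real file, the same point-style layout can probably be reached without copying the arrays, for example by selecting the three variables from the opened dataset and promoting the two position variables with `Dataset.set_coords`. The gridded form is only worth the extra memory if later steps really need latitude and longitude as separate dimensions.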

0 answers