I am trying to extract data from netCDF files at timestamped GPS positions that represent the movement of an animal:
gps <- structure(list(Lon = c(-3.179046, -3.180403, -3.187734, -3.190197,
-3.182058, -3.181318, -3.181994, -3.181549, -3.178166, -3.177124,
-3.175182, -3.174655, -3.174565, -3.175529, -3.175918, -3.174718,
-3.173933, -3.173839, -3.173253, -3.172965), Lat = c(58.654169,
58.654831, 58.661016, 58.664087, 58.674328, 58.674753, 58.675205,
58.67472, 58.673246, 58.672343, 58.671392, 58.670393, 58.66996,
58.669358, 58.669753, 58.670683, 58.670706, 58.671087, 58.670705,
58.669429), datetime = structure(c(1475252738, 1475252983, 1475266971,
1475269367, 1475298968, 1475299231, 1475299528, 1475299820, 1475300050,
1475300307, 1475300499, 1475300785, 1475301021, 1475301282, 1475301568,
1475301836, 1475302130, 1475302390, 1475302676, 1475302975), class = c("POSIXct",
"POSIXt"), tzone = "")), .Names = c("Lon", "Lat", "datetime"), class = "data.frame", row.names = 2:21)
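I have 73 netCDF files, each containing 5 days of data, which together make up a calendar year. Each file has 120 time 'layers', each representing one hour, and each cell has 10 z values. For every row in my gps data frame I want to extract the data for a particular variable at position xy and time t. Each netCDF file has its start date in the file name, for example: "PFOW_Climatology2_0001_2016-01-01.nc"
So I can link each row to a specific file by creating extra index columns in the gps data frame. So far I have managed to identify the correct file and layer: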
library(ncdf4)  # nc_open
library(raster) # brick, extract

filenames <- list.files(path=getwd())
x <- lapply(filenames, nc_open)
fd <- as.Date(substr(filenames, 24, 33)) # characters 24-33 of the file name hold the reference date, e.g. "2016-01-01"
# NOTE: gps$datetimeNC and gps$dateNC are not present in the sample gps data frame above
dd <- as.Date(gps$datetimeNC) # reduce datetime to just a date so we can reference the correct file
i <- findInterval(dd, fd) # identify which file in filenames the gps data for each row corresponds to
gps$file <- i
# Name for each row in the gps data frame that corresponds to the raster layer names: X<year>.<month>.<day>.<hour>
gps$rasterlayer <- paste("X", as.numeric(format(gps$dateNC,'%Y')), ".", format(gps$dateNC,'%m'), ".",
                         format(gps$dateNC,'%d'), ".", as.numeric(format(gps$datetime, "%H")), sep="")
b <- brick(filenames[gps$file[1]], stopIfNotEqualSpaced=FALSE) # open the brick once and reuse it
v <- extract(b, gps[1 , c('Lon', 'Lat')], layer=match(gps$rasterlayer[1], names(b)))
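This opens the corresponding nc file as a raster brick and uses the rasterlayer index created earlier to match the date-time stamp to the correct layer in the brick. The extraction is a function I am still working on, which I will apply to each row of the data frame. The two problems I have are:
1) My nc files are on an unstructured grid. I rasterize them anyway with the `stopIfNotEqualSpaced=FALSE` argument, but I know the resulting values are incorrect, because the cells are not only unevenly spaced but also differ in size. Is there a way to extract data from an unstructured grid? If I try this without `stopIfNotEqualSpaced=FALSE`, I get the error: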
Error in .rasterObjectFromCDF(x, type = objecttype, band = band, ...) :
cells are not equally spaced; you should extract values as points
I would be happy to extract the values as points, but I do not know how.
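2) How can I change the variable I extract data from? At the moment it just picks the first (default) variable and returns data for it, but I want to be able to choose the output variable so that I can extract any of the 61 variables in the nc file.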
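To make both points concrete, here is a rough sketch of what reading a named variable at a single point might look like with the ncdf4 package instead of raster. It is not a tested solution for the real files: `read_point` is a hypothetical helper, the index order assumes the `[nele, siglay, time]` layout from the header, and the tiny file written to a temporary path is synthetic and only stands in for a real 7.5 GB file.

```r
library(ncdf4)

# Hypothetical helper (not part of ncdf4): read a single value of `varname`
# for one element, sigma layer and time step. The index order assumes the
# [nele, siglay, time] layout shown in the file header.
read_point <- function(nc, varname, elem, layer, tstep) {
  val <- ncvar_get(nc, varname,
                   start = c(elem, layer, tstep),
                   count = c(1, 1, 1))
  as.numeric(val)
}

# Small synthetic file standing in for a real one:
# 3 elements, 2 sigma layers, 4 time steps
d_nele   <- ncdim_def("nele",   "", 1:3, create_dimvar = FALSE)
d_siglay <- ncdim_def("siglay", "", 1:2, create_dimvar = FALSE)
d_time   <- ncdim_def("time",   "", 1:4, create_dimvar = FALSE)
dims  <- list(d_nele, d_siglay, d_time)
var_u <- ncvar_def("u", "meters s-1", dims, missval = -9999, prec = "float")
var_v <- ncvar_def("v", "meters s-1", dims, missval = -9999, prec = "float")

tmp <- tempfile(fileext = ".nc")
nc  <- nc_create(tmp, list(var_u, var_v))
vals <- array(1:24, dim = c(3, 2, 4))
ncvar_put(nc, var_u, vals)   # u holds 1..24
ncvar_put(nc, var_v, -vals)  # v holds -1..-24
nc_close(nc)

# Read u and v by name at element 2, layer 1, time step 3
nc <- nc_open(tmp)
u_val <- read_point(nc, "u", elem = 2, layer = 1, tstep = 3)
v_val <- read_point(nc, "v", elem = 2, layer = 1, tstep = 3)
nc_close(nc)
```

Because the variable is passed as a string, the same call should work for any of the variables that share these three dimensions. Reading with `start` and `count` also avoids loading the whole array into memory.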
Apologies for not including a netCDF file as an example, but they are 7.5 GB each. The dimensions look like this:
10 dimensions:
nele Size:148567
node Size:77950
siglay Size:10
long_name: Sigma Layers
standard_name: ocean_sigma/general_coordinate
positive: up
valid_min: -1
valid_max: 0
formula_terms: sigma: siglay eta: zeta depth: h
siglev Size:11
long_name: Sigma Levels
standard_name: ocean_sigma/general_coordinate
positive: up
valid_min: -1
valid_max: 0
formula_terms: sigma:siglay eta: zeta depth: h
three Size:3
time Size:120 *** is unlimited ***
long_name: time
units: days since 1858-11-17 00:00:00
format: modified julian day (MJD)
time_zone: UTC
DateStrLen Size:26
maxnode Size:11
maxelem Size:9
four Size:4
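Since the time dimension is stored as modified Julian days in UTC, the time step for a GPS fix could also be found numerically rather than by matching layer names. The sketch below uses only base R; `mjd_to_posix` and `nearest_step` are hypothetical helper names, and the time axis is synthetic (with a real file it would presumably come from something like `ncvar_get(nc, "time")`). It assumes the GPS timestamps are also in UTC.

```r
# Convert modified Julian days (days since 1858-11-17 00:00:00 UTC) to POSIXct
mjd_to_posix <- function(mjd) {
  as.POSIXct(mjd * 86400, origin = "1858-11-17", tz = "UTC")
}

# Index of the time step closest to timestamp t
nearest_step <- function(t, nc_times) {
  which.min(abs(as.numeric(nc_times) - as.numeric(t)))
}

# Synthetic time axis: 120 hourly steps starting at MJD 57388,
# which corresponds to 2016-01-01 00:00:00 UTC
mjd <- 57388 + (0:119) / 24
nc_times <- mjd_to_posix(mjd)

# Example timestamp (assumed to be UTC)
t_gps <- as.POSIXct("2016-01-01 05:20:00", tz = "UTC")
step <- nearest_step(t_gps, nc_times)  # 05:20 is closest to the 05:00 step
```

The resulting index could then be used as the third element of `start` when reading a variable laid out as `[nele, siglay, time]`.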
Each file has 61 variables, but I have included only a few here. The variables I want to extract are u and v:
61 variables (excluding dimension variables):
float lon[node]
long_name: nodal longitude
standard_name: longitude
units: degrees_east
float lat[node]
long_name: nodal latitude
standard_name: latitude
units: degrees_north
char Times[DateStrLen,time]
time_zone: GMT
float u[nele,siglay,time]
long_name: Eastward Water Velocity
standard_name: eastward_sea_water_velocity
units: meters s-1
grid: fvcom_grid
type: data
coordinates: time siglay latc lonc
mesh: fvcom_mesh
location: face
float v[nele,siglay,time]
long_name: Northward Water Velocity
standard_name: Northward_sea_water_velocity
units: meters s-1
grid: fvcom_grid
type: data
coordinates: time siglay latc lonc
mesh: fvcom_mesh
location: face
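Because u and v are defined on mesh faces (`location: face`), extracting "as points" probably means first finding the element that belongs to each GPS fix. Below is a minimal sketch of a nearest-centre lookup in base R. It assumes the files contain element-centre coordinates named `lonc` and `latc` (they are named in the `coordinates` attribute but are not among the variables listed above); `nearest_element` is a hypothetical helper and the centre coordinates in the example are made up. The nearest centre is only an approximation of the triangle that actually contains the point.

```r
# Hypothetical helper: index of the element whose centre is closest to a point.
# Longitude differences are scaled by cos(latitude) so that both axes are
# roughly comparable at high latitudes.
nearest_element <- function(lon, lat, lonc, latc) {
  dx <- (lonc - lon) * cos(lat * pi / 180)
  dy <- latc - lat
  which.min(dx^2 + dy^2)
}

# Made-up element centres standing in for ncvar_get(nc, "lonc") / ncvar_get(nc, "latc")
lonc <- c(-3.200, -3.180, -3.170)
latc <- c(58.650, 58.655, 58.670)

# First GPS fix from the sample data frame
elem <- nearest_element(-3.179046, 58.654169, lonc, latc)
```

With 148567 elements and one lookup per GPS row this brute-force search should still be fast enough, and the element index it returns would be the first element of `start` when reading u or v.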