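Unfortunately, I am still quite new to Python and don't have time right now to dig into it more deeply, so I can't understand or fix the error shown in the Python console. I am trying to use this code to extract data from multiple netCDF files for multiple locations: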
# needed to find the .nc files in the working folder
import glob
# required to read the netCDF4 data
from netCDF4 import Dataset
# required to read and write the csv files
import pandas as pd
# required for using the array functions
import numpy as np
# Record all the years of the netCDF files into a Python list
all_years = []
for file in glob.glob('*.nc'):
    print(file)
    # reading the file
    data = Dataset(file, 'r')
    # saving the time variable
    time = data.variables['time']
    # saving the year that is written in the file
    year = time.units[11:15]
    # the for loop appends the year of each file, so the list ends up covering all files
    all_years.append(year)
# Create an empty pandas DataFrame covering the whole date range; the required data is read and filled in later
year_start = min(all_years)
end_year = max(all_years)
date_range = pd.date_range(start = str(year_start) + '-01-01',
                           end = str(end_year) + '-12-31',
                           freq = 'D')
# a DataFrame filled with 0.0 is created, indexed by date_range, with a single 'Precipitation' column
df = pd.DataFrame(0.0, columns = ['Precipitation'], index = date_range)
# The names, lat and lon of the locations of interest are defined in a csv file
# this reads the locations file
locations = pd.read_csv('stations_locations.csv')
# a for loop is used to go through the rows one by one
for index, row in locations.iterrows():
    # extract the information from the csv row into temporary variables
    location = row['names']
    location_lat = row['latitude']
    location_lon = row['longitude']
    # Sorting all_years just to be sure that the data is written in the correct order
    all_years.sort()
    # now read the netCDF files; here they come from the CNRM-CM5 model
    for yr in all_years:
        # Reading in the data
        data = Dataset('pr_day_CNRM-CM5_historical_r1i1p1_%s0101-%s1231.nc'%(yr,yr), 'r')
        # Storing the lat and lon data of the netCDF file into variables
        lat = data.variables['lat'][:]
        lon = data.variables['lon'][:]
        # we already have the coordinates of the point to be extracted;
        # to find the closest grid point we subtract the coordinates
        # and check which one has the minimum distance
        # Squared difference between the specified lat,lon and the lat,lon of the netCDF
        sq_diff_lat = (lat - location_lat)**2
        sq_diff_lon = (lon - location_lon)**2
        # Identify the index of the min value for lat and lon
        min_index_lat = sq_diff_lat.argmin()
        min_index_lon = sq_diff_lon.argmin()
        # Accessing the precipitation data
        temp = data.variables['pr']
        # Creating the date range for each year during each iteration
        start = str(yr) + '-01-01'
        end = str(yr) + '-12-31'
        d_range = pd.date_range(start = start,
                                end = end,
                                freq = 'D')
        for t_index in np.arange(0, len(d_range)):
            print('Recording the value for: ' + str(location)+'_'+ str(d_range[t_index]))
            # write into the existing 'Precipitation' column with a single .loc call
            df.loc[d_range[t_index], 'Precipitation'] = temp[t_index, min_index_lat, min_index_lon]
    df.to_csv(str(location) + '.csv')
This is the error that is shown:
File "G:\Selection Cannon\Historical\CNRM-CM5_r1i1p1\pr\extracting data_CNRM-CM5_pr.py", line 62, in <module>
data = Dataset('pr_day_CNRM-CM5_historical_r1i1p1_%s0101-%s1231.nc'%(yr,yr), 'r')
File "netCDF4\_netCDF4.pyx", line 2321, in netCDF4._netCDF4.Dataset.__init__
File "netCDF4\_netCDF4.pyx", line 1885, in netCDF4._netCDF4._ensure_nc_success
FileNotFoundError: [Errno 2] No such file or directory: b'pr_day_CNRM-CM5_historical_r1i1p1_18500101-18501231.nc'
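When I inspect the variable 'time.units', it says "days since 1850-1-1", but the folder only contains files for 1975-2005. If I inspect 'all_years', it just shows '1850' seven times. I think this has to do with the line "year = time.units[11:15]", but that is how the guy in the YouTube video did it. Can someone help me fix this so that the code extracts the files from 1975 onwards?
Best regards, Alex
PS: This is my first post, so please let me know if any additional information or data is needed :)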
Answer 0 (score: 0)
Before anything else, it looks like you are not providing the correct path. It should be something like "G:/path/to/pr_day_CNRM-CM5_historical_r1i1p1_18500101-18501231.nc".
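A note on where the 1850 comes from: in CF-style netCDF files, 'time.units' (here "days since 1850-1-1") gives the reference date that the time values are counted from, not the year of the data stored in the file, so slicing it yields '1850' for every file. One possible way around this is to read the years from the file names instead. The following is only a minimal sketch, assuming the names end in '_YYYYMMDD-YYYYMMDD.nc'; the helper names 'year_span' and 'spans_by_file' and the two example file names are made up for illustration, and a single file may well cover more than one year.

```python
import re

# Assumption: file names end in _YYYYMMDD-YYYYMMDD.nc (start and end date of the file)
DATE_SPAN = re.compile(r'_(\d{4})\d{4}-(\d{4})\d{4}\.nc$')

def year_span(filename):
    """Return (first_year, last_year) parsed from the file name, or None if it does not match."""
    match = DATE_SPAN.search(filename)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))

def spans_by_file(filenames):
    """Map each matching file name to its (first_year, last_year) span."""
    spans = {}
    for name in filenames:
        span = year_span(name)
        if span is not None:
            spans[name] = span
    return spans

# Hypothetical names for illustration only; in practice pass glob.glob('*.nc')
example = [
    'pr_day_CNRM-CM5_historical_r1i1p1_19750101-19791231.nc',
    'pr_day_CNRM-CM5_historical_r1i1p1_20050101-20051231.nc',
]
spans = spans_by_file(example)
first_year = min(start for start, _ in spans.values())
last_year = max(end for _, end in spans.values())
print(first_year, last_year)  # prints: 1975 2005
```

With the real folder, spans_by_file(glob.glob('*.nc')) would give the actual file names to open with Dataset, rather than rebuilding a name from a single year, which avoids asking for a file that does not exist.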