Question

I am trying to convert a netCDF (*.nc) file (classic format) to CSV. The file comes from a NOAA precipitation dataset.

I found helpful code in this post; however, when I run it, I get this exception:

Traceback (most recent call last):
  File "./test2.py", line 31, in <module>
    precip_ts = pd.Series(precip, index=dtime)
  File "/usr/local/lib/python2.7/site-packages/pandas/core/series.py", line 275, in __init__
    raise_cast_failure=True)
  File "/usr/local/lib/python2.7/site-packages/pandas/core/series.py", line 4165, in _sanitize_array
    raise Exception('Data must be 1-dimensional')
Exception: Data must be 1-dimensional

Here is the test2.py script (the same as in the post referenced above):

#!/usr/local/bin/python2.7

import netCDF4
import pandas as pd

precip_nc_file = 'precip.V1.0.2006.nc'
nc = netCDF4.Dataset(precip_nc_file, mode='r')
nc.variables.keys()

lat = nc.variables['lat'][:]
lon = nc.variables['lon'][:]
time_var = nc.variables['time']
dtime = netCDF4.num2date(time_var[:],time_var.units)
precip = nc.variables['precip'][:]

# a pandas.Series designed for time series of a 2D lat,lon grid
precip_ts = pd.Series(precip, index=dtime)

precip_ts.to_csv('precip.csv',index=True, header=True)

It fails on the pd.Series call. Can you give me any guidance on why pandas fails here? I thought it was supposed to handle 2D data!

The end result I am looking for is a CSV file in which each row has lon, lat, datetime, and precip values.

Answer 1

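Here pd.Series() appears to expect a one-dimensional object, while the full masked array has more than one dimension. To get at the part of the array you are interested in, you can use '.data' and similar accessors. The code below shows how to save precip_ts to CSV.

I do not yet fully understand the structure of the 'precip.V1.0.2006.nc' file downloaded from here (.nc), because the resulting series do not all have the same number of elements, so it is hard to tell which values belong on the same row. For example, lat has 120 values while lon has 300. If all the arrays had the same length, they could be combined into a single pandas DataFrame and then saved as a CSV file (code at the very bottom).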
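To make the dimension problem concrete, here is a minimal sketch that uses a small synthetic masked array in place of the real file. The (time, lat, lon) layout, the array sizes, and the variable names series_ok and cell_ts are assumptions for illustration only; they are not read from 'precip.V1.0.2006.nc'.

```python
import numpy as np
import numpy.ma as ma
import pandas as pd

# Synthetic stand-in for nc.variables['precip'][:].
# ASSUMPTION: the real variable is laid out as (time, lat, lon).
values = np.arange(60, dtype=float).reshape(3, 4, 5)
precip = ma.masked_less(values, 1.0)          # mask one cell, like a missing value
dtime = pd.date_range("2006-01-01", periods=3, freq="D")

print(precip.ndim, precip.shape)

# Passing the whole grid to pd.Series fails, because a Series is 1-D.
try:
    pd.Series(precip, index=dtime)
    series_ok = True
except Exception as err:
    series_ok = False
    print("pd.Series rejected the array:", err)

# A single grid cell, taken along the time axis, is 1-D and works.
cell_ts = pd.Series(precip[:, 0, 0].filled(np.nan), index=dtime)
print(cell_ts)
```

If that layout assumption holds for the real file, a time series only makes sense for one (lat, lon) cell at a time, which is why the whole grid cannot go into a single Series.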

Import the libraries

import netCDF4
import pandas as pd
import numpy.ma as ma

The code from the question, repeated below

# open the file first, so that nc exists before the variables are read
precip_nc_file = 'precip.V1.0.2006.nc'
nc = netCDF4.Dataset(precip_nc_file, mode='r')
nc.variables.keys()

lat = nc.variables['lat'][:]
lon = nc.variables['lon'][:]
time_var = nc.variables['time']
dtime = netCDF4.num2date(time_var[:],time_var.units)
precip = nc.variables['precip'][:]

Edited line below: dtime is replaced with dtime.data to access the data underlying the masked array

precip_ts = pd.Series(dtime.data, index=dtime)

Save as .csv

precip_ts.to_csv('precip.csv',index=True, header=True)

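The code below can combine the series into a single DataFrame only if they all have the same length. It did not work for me, because the series created from the downloaded file have unequal lengths.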

df = pd.DataFrame({
    'lat': lat,
    'lon': lon,
    'dtime': dtime,
    'precip': precip 
})
df.head(2)
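One possible way around the unequal lengths, sketched under assumptions: if lat and lon are the axes of a grid and precip has shape (time, lat, lon), which the 120 and 300 counts suggest but which I have not confirmed for this file, then every precip value pairs with exactly one (datetime, lat, lon) combination. The arrays below are small synthetic stand-ins, and names such as index and csv_text are made up for the example.

```python
import numpy as np
import pandas as pd

# Synthetic stand-ins for the arrays read from the file.
# ASSUMPTION: precip is laid out as (time, lat, lon).
lat = np.array([20.0, 20.25])
lon = np.array([230.0, 230.25, 230.5])
dtime = pd.date_range("2006-01-01", periods=2, freq="D")
raw = np.arange(12, dtype=float).reshape(2, 2, 3) - 1.0
precip = np.ma.masked_less(raw, 0.0)          # negative value treated as missing

# from_product varies the last level fastest, which matches the
# C-order ravel() of an array shaped (time, lat, lon).
index = pd.MultiIndex.from_product(
    [dtime, lat, lon], names=["datetime", "lat", "lon"]
)

# Masked cells become NaN, then the grid is flattened to one value per row.
df = pd.DataFrame({"precip": precip.filled(np.nan).ravel()}, index=index)
df = df.reset_index()[["lon", "lat", "datetime", "precip"]]

# Returns the CSV as a string; pass a file name to write to disk instead.
csv_text = df.to_csv(index=False)
print(csv_text)
```

This gives one row per grid cell per time step, in the lon, lat, datetime, precip layout asked for in the question. With the real arrays the table would be large (time steps x 120 x 300 rows), so dropping the NaN rows with df.dropna() before writing may be worthwhile.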
