pandas
I cannot read the following text file:
NothGrassland Meteor Sites
MTCLIM v4.3 OUTPUT FILE : Mon Jun 26 16:57:31 2017
year yday Tmax Tmin Tday prcp VPD srad daylen
(deg C) (deg C) (deg C) (cm) (Pa) (W m-2) (s)
1961 1 -24.08 -36.19 -27.41 0.00 36.81 128.45 28460
1961 2 -16.08 -29.79 -19.85 0.02 75.12 135.12 28524
1961 3 -16.08 -26.19 -18.86 0.05 65.86 118.79 28594
1961 4 -23.58 -33.29 -26.25 0.00 34.87 116.98 28668
1961 5 -24.28 -37.49 -27.91 0.00 37.27 163.75 28748
1961 6 -20.68 -33.19 -24.12 0.01 49.79 133.63 28832
1961 7 -19.48 -31.29 -22.73 0.18 53.78 131.91 28922
The code I use to read the file is:
df=pd.read_csv(file,sep=' ',header=0,skiprows=[0,1,3])
It raises this error:
runfile('C:/temp/python/Models/GSI.py', wdir='C:/temp/python')
Traceback (most recent call last):
File "<ipython-input-115-7bbdd08f49f8>", line 1, in <module>
runfile('C:/temp/python/Models/GSI.py', wdir='C:/temp/python')
File "C:\Program Files\Winpython\python-3.6.1.amd64\lib\site-packages\spyder\utils\site\sitecustomize.py", line 880, in runfile
execfile(filename, namespace)
File "C:\Program Files\Winpython\python-3.6.1.amd64\lib\site-packages\spyder\utils\site\sitecustomize.py", line 102, in execfile
exec(compile(f.read(), filename, 'exec'), namespace)
File "C:/temp/python/Models/GSI.py", line 23, in <module>
df=pd.read_csv(file,header=0,sep=' ')
File "C:\Program Files\Winpython\python-3.6.1.amd64\lib\site-packages\pandas\io\parsers.py", line 646, in parser_f
return _read(filepath_or_buffer, kwds)
File "C:\Program Files\Winpython\python-3.6.1.amd64\lib\site-packages\pandas\io\parsers.py", line 401, in _read
data = parser.read()
File "C:\Program Files\Winpython\python-3.6.1.amd64\lib\site-packages\pandas\io\parsers.py", line 939, in read
ret = self._engine.read(nrows)
File "C:\Program Files\Winpython\python-3.6.1.amd64\lib\site-packages\pandas\io\parsers.py", line 1508, in read
data = self._reader.read(nrows)
File "pandas\parser.pyx", line 848, in pandas.parser.TextReader.read (pandas\parser.c:10415)
File "pandas\parser.pyx", line 870, in pandas.parser.TextReader._read_low_memory (pandas\parser.c:10691)
File "pandas\parser.pyx", line 924, in pandas.parser.TextReader._read_rows (pandas\parser.c:11437)
File "pandas\parser.pyx", line 911, in pandas.parser.TextReader._tokenize_rows (pandas\parser.c:11308)
File "pandas\parser.pyx", line 2024, in pandas.parser.raise_parser_error (pandas\parser.c:27037)
CParserError: Error tokenizing data. C error: Expected 10 fields in line 3, saw 34
If I remove sep=' ' and read the file like this:
df=pd.read_csv(file,header=None,skiprows=4)
the code runs.
Answer 0 (score: 2)
For me, sep="\s+" or delim_whitespace=True works. Both split on runs of whitespace instead of on every single space:
import pandas as pd
from io import StringIO  # pandas.compat.StringIO is not available in recent pandas versions
temp=u"""NothGrassland Meteor Sites
MTCLIM v4.3 OUTPUT FILE : Mon Jun 26 16:57:31 2017
year yday Tmax Tmin Tday prcp VPD srad daylen
(deg C) (deg C) (deg C) (cm) (Pa) (W m-2) (s)
1961 1 -24.08 -36.19 -27.41 0.00 36.81 128.45 28460
1961 2 -16.08 -29.79 -19.85 0.02 75.12 135.12 28524
1961 3 -16.08 -26.19 -18.86 0.05 65.86 118.79 28594
1961 4 -23.58 -33.29 -26.25 0.00 34.87 116.98 28668
1961 5 -24.28 -37.49 -27.91 0.00 37.27 163.75 28748
1961 6 -20.68 -33.19 -24.12 0.01 49.79 133.63 28832
1961 7 -19.48 -31.29 -22.73 0.18 53.78 131.91 28922"""
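# after testing, replace StringIO(temp) with 'filename.csv'
# skiprows=[0,1,3] drops the two title lines and the units line; the remaining first line is the header
df = pd.read_csv(StringIO(temp), sep=r"\s+", skiprows=[0,1,3], header=0)
print(df)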
year yday Tmax Tmin Tday prcp VPD srad daylen
0 1961 1 -24.08 -36.19 -27.41 0.00 36.81 128.45 28460
1 1961 2 -16.08 -29.79 -19.85 0.02 75.12 135.12 28524
2 1961 3 -16.08 -26.19 -18.86 0.05 65.86 118.79 28594
3 1961 4 -23.58 -33.29 -26.25 0.00 34.87 116.98 28668
4 1961 5 -24.28 -37.49 -27.91 0.00 37.27 163.75 28748
5 1961 6 -20.68 -33.19 -24.12 0.01 49.79 133.63 28832
6 1961 7 -19.48 -31.29 -22.73 0.18 53.78 131.91 28922
Also:
# after testing, replace StringIO(temp) with 'filename.csv'
# note: newer pandas releases deprecate delim_whitespace in favour of sep=r"\s+"
df = pd.read_csv(StringIO(temp), delim_whitespace=True, skiprows=[0,1,3], header=0)
print(df)
year yday Tmax Tmin Tday prcp VPD srad daylen
0 1961 1 -24.08 -36.19 -27.41 0.00 36.81 128.45 28460
1 1961 2 -16.08 -29.79 -19.85 0.02 75.12 135.12 28524
2 1961 3 -16.08 -26.19 -18.86 0.05 65.86 118.79 28594
3 1961 4 -23.58 -33.29 -26.25 0.00 34.87 116.98 28668
4 1961 5 -24.28 -37.49 -27.91 0.00 37.27 163.75 28748
5 1961 6 -20.68 -33.19 -24.12 0.01 49.79 133.63 28832
6 1961 7 -19.48 -31.29 -22.73 0.18 53.78 131.91 28922
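To illustrate why sep=' ' fails, here is a minimal sketch using only the standard library. It assumes the real file pads its columns with runs of spaces, which the "saw 34" in the error message suggests. The padded sample line is hypothetical, since the spacing of the original file is not preserved in the post. Splitting on a single space roughly mirrors what sep=' ' does, and splitting on \s+ mirrors sep=r"\s+".

```python
import re

# Hypothetical padded line (assumption: the real file aligns columns with runs of spaces).
line = "1961     1   -24.08   -36.19   -27.41     0.00    36.81   128.45    28460"

# Splitting on every single space yields an empty field between consecutive spaces.
single_space = line.split(" ")

# Splitting on runs of whitespace yields exactly one field per value.
any_whitespace = re.split(r"\s+", line.strip())

print(len(single_space))    # far more fields than there are columns
print(len(any_whitespace))  # one field per column
print(any_whitespace)
```

The extra empty fields are what the parser counts when it reports more fields than expected.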