I have a CSV file formatted like this:
"Year","Month","Day","Hour","Minute","Direct","Diffuse","D_Global","D_IR","U_Global","U_IR","Zenith"
2001,3,1,0,1,0.28,84.53,83.53,224.93,76.67,228.31,80.031
2001,3,1,0,2,0.15,84.24,83.25,224.76,76.54,228.62,80.059
2001,3,1,0,3,0.16,84.63,83.43,225.62,76.76,229.06,80.087
2001,3,1,0,4,0.20,85.20,83.99,226.56,77.15,228.96,80.115
My script is:
df1 = pd.read_csv(input_file,
sep = ",",
parse_dates = {'Date': [0,1,2,3,4]},
date_parser = lambda x: pd.to_datetime(x, format="%Y %m %d %H %M"),
index_col = ['Date'])
The error I get is:
Traceback (most recent call last):
File "convertCSVtoNC.py", line 70, in <module>
openFile(sys.argv[1:])
File "convertCSVtoNC.py", line 30, in openFile
df2 = createDataFrame(input_file, counter)
File "convertCSVtoNC.py", line 43, in createDataFrame
index_col = ['Date'])
...
TypeError: <lambda>() takes exactly 1 argument (5 given)
The script ran fine on 302 earlier input files, which are formatted like this:
"Year","Month","Day","Hour","Minute","Direct","Diffuse","D_Global","D_IR","U_Global","U_IR","Zenith"
1976,1,1,0,3,-999.00,-999.00,-999.00,-999.00,-999.00,-999.00,95.751
1976,1,1,0,6,-999.00,-999.00,-999.00,-999.00,-999.00,-999.00,95.839
1976,1,1,0,9,-999.00,-999.00,-999.00,-999.00,-999.00,-999.00,95.930
1976,1,1,0,12,-999.00,-999.00,-999.00,-999.00,-999.00,-999.00,96.023
Any idea why?
Answer 0: (score: 1)
It works for me:
df1 = pd.read_csv(file_name, parse_dates={'Date':[0,1,2,3,4]},
date_parser=lambda x: pd.to_datetime(x, format='%Y %m %d %H %M'),
index_col=['Date'])
In [215]: df1
Out[215]:
Direct Diffuse D_Global D_IR U_Global U_IR Zenith
Date
2001-03-01 00:01:00 0.28 84.53 83.53 224.93 76.67 228.31 80.031
2001-03-01 00:02:00 0.15 84.24 83.25 224.76 76.54 228.62 80.059
2001-03-01 00:03:00 0.16 84.63 83.43 225.62 76.76 229.06 80.087
2001-03-01 00:04:00 0.20 85.20 83.99 226.56 77.15 228.96 80.115
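PS: I am using Pandas 0.19.1
Answer 1: (score: 0)
It turned out that my input CSV file had some extra newlines at the end. I think the error makes sense, since my lambda function takes the whole row as input. This may be something to check for in future lambda-related problems.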
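One way to avoid this class of error is to skip date_parser entirely and assemble the timestamps after reading the file. The sketch below is not the original poster's fix; it assumes the five date columns are named as in the question, and the helper name read_radiation_csv and the shortened sample data are hypothetical. Rows with missing date parts (for example, stray lines at the end of the file) are dropped before the index is built. Newer pandas releases have also deprecated date_parser, so this form should be less version-sensitive.

```python
import io

import pandas as pd

DATE_COLS = ["Year", "Month", "Day", "Hour", "Minute"]


def read_radiation_csv(source):
    """Read the CSV and build a 'Date' index from the five date columns.

    Hypothetical helper for illustration only.
    """
    # Purely blank lines are skipped by default; the flag is spelled out for clarity.
    df = pd.read_csv(source, skip_blank_lines=True)
    # Drop rows where any date component is missing (e.g. stray trailing lines).
    df = df.dropna(subset=DATE_COLS)
    # pd.to_datetime can assemble timestamps from year/month/day/hour/minute columns.
    parts = df[DATE_COLS].astype(int)
    parts.columns = [c.lower() for c in DATE_COLS]
    df.index = pd.DatetimeIndex(pd.to_datetime(parts), name="Date")
    # The separate date columns are no longer needed once the index exists.
    return df.drop(DATE_COLS, axis=1)


# Shortened sample in the question's format, with extra newlines at the end.
sample = io.StringIO(
    '"Year","Month","Day","Hour","Minute","Direct","Zenith"\n'
    "2001,3,1,0,1,0.28,80.031\n"
    "2001,3,1,0,2,0.15,80.059\n"
    "\n\n"
)
df1 = read_radiation_csv(sample)
print(df1)
```

With a real file, the call would be read_radiation_csv(input_file). Building the index after the read also makes it easy to inspect the rows that were dropped, which helps locate the malformed lines in the input.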