I have an .xls file that looks like this:
Value of Construction Put in Place...
(Millions of Dollars....)
Blank Row
Date Total_Construction Total Residential Total Nonresidential...Columns
Dec-15 1,116,570 435,454 681,217
Nov-15 1,115,966 432,295 683,671
Oct-15 1,122,749 431,164 691,585
.
.
.
I am trying to import the file so that I get the following:
Date Total_Construction Total Residential Total Nonresidential
Dec-15 1,116,570 435,454 681,217
Nov-15 1,115,966 432,295 683,671
Oct-15 1,122,749 431,164 691,585
.
.
.
using the following code:
for chunk in pandas.read_csv('/PATH/totsatime.xls',
                             names = ['Date', 'Total Residential', 'Total Nonresidential'],
                             header = 4,
                             chunksize = 1,
                             skiprows = range(1, 4),
                             thousands = ','):
    if chunk['Date'] == 'Dec-01':
        break
    else:
        df = pandas.DataFrame(chunk)
However, I end up with the following:
Date Total Residential Total Nonresidential
Lodging NaN NaN
Office NaN NaN
Commercial NaN NaN
Health care NaN NaN
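The Date column ends up filled with values from columns I am not importing. Any suggestions would be greatly appreciated.
Thanks in advance.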
Answer 0 (score: 5)
Do not use read_csv to import an xls file. Use read_excel instead. See the pandas documentation for read_excel.
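A minimal sketch of how the import might look with pandas.read_excel, under some assumptions that are not confirmed by the question: the header row is taken to be the fourth row of the sheet (two title rows, one blank row, then the header, hence header=3), the Date cells are taken to be stored as text such as 'Dec-15' rather than as real Excel dates, and the intent of the original loop is taken to be "keep the rows before the first Dec-01 row". The helper names trim_before and load_construction are made up for illustration. Reading legacy .xls files with pandas also requires the xlrd package to be installed.

```python
import pandas as pd


def trim_before(df, stop_label, column="Date"):
    """Return the rows that come before the first row where `column` equals stop_label.

    If stop_label never appears, the frame is returned unchanged.
    """
    # Compare as strings; this assumes dates are stored as text like 'Dec-01'.
    mask = df[column].astype(str) == stop_label
    hits = mask.to_numpy().nonzero()[0]
    if len(hits) == 0:
        return df
    return df.iloc[:hits[0]]


def load_construction(path, stop_label="Dec-01"):
    """Read the sheet with read_excel and drop rows from stop_label onward."""
    # header=3 is an assumption about the layout: rows 0-1 are titles,
    # row 2 is blank, row 3 holds the column names.
    df = pd.read_excel(path, header=3)
    return trim_before(df, stop_label)
```

Usage would be along the lines of df = load_construction('/PATH/totsatime.xls'). If the Date cells turn out to be real Excel dates, read_excel will return datetime values and the string comparison in trim_before would need to compare against a timestamp instead.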