Python Pandas read_csv does not import correctly

Time: 2016-04-27 18:59:19

Tags: python pandas

I have an .xls file that looks something like this...

Value of Construction Put in Place...
(Millions of Dollars....)
Blank Row
Date    Total_Construction Total Residential Total Nonresidential...Columns 
Dec-15  1,116,570          435,454           681,217 
Nov-15  1,115,966          432,295           683,671
Oct-15  1,122,749          431,164           691,585   
.
.
.

I am trying to import this file to get the following:

Date    Total_Construction Total Residential Total Nonresidential 
Dec-15  1,116,570          435,454           681,217 
Nov-15  1,115,966          432,295           683,671
Oct-15  1,122,749          431,164           691,585   
.
.
. 

Using the following code:

for chunk in pandas.read_csv('/PATH/totsatime.xls',
                 names      = ['Date', 'Total Residential', 'Total Nonresidential'],
                 header     = 4,
                 chunksize  = 1,
                 skiprows   = range(1, 4),
                 thousands  = ','):

    if chunk['Date'] == 'Dec-01':
        break

    else:
        df = pandas.DataFrame(chunk)

However, I end up with the following result:

Date             Total Residential     Total Nonresidential
Lodging          NaN                   NaN
Office          NaN                   NaN
Commercial      NaN                   NaN
Health care     NaN                   NaN

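The Date column ends up filled with values from columns I am not importing. Any suggestions would be greatly appreciated.

Thank you in advance.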

1 Answer:

Answer 0: (score: 5)

Do not use read_csv to import an xls file. Use read_excel instead. See the pandas documentation for read_excel.
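
A minimal sketch of that suggestion might look like the following. The number of preamble rows to skip, the exact header spellings, and the idea that Date cells are stored as text such as 'Dec-15' are all assumptions based on the sample in the question, and load_construction and trim_before are hypothetical helper names. Reading a legacy .xls file with read_excel also requires the xlrd package to be installed.

```python
import pandas as pd

# Assumption: these match the header cells in the sheet exactly.
COLUMNS = ["Date", "Total Residential", "Total Nonresidential"]


def trim_before(df, stop_label, column="Date"):
    """Keep only the rows above the first row whose `column` equals `stop_label`."""
    matches = df[column].astype(str).eq(stop_label).to_numpy()
    if not matches.any():
        # Stop label not present: return the frame unchanged.
        return df
    first = matches.argmax()  # position of the first True
    return df.iloc[:first]


def load_construction(path, skiprows=3, stop_label="Dec-01"):
    """Read the sheet with read_excel instead of read_csv.

    Assumption: three rows (two title lines and a blank row) precede the
    header row; adjust `skiprows` if the real file differs.
    """
    df = pd.read_excel(path, skiprows=skiprows, usecols=COLUMNS)
    return trim_before(df, stop_label)


# Usage (path taken from the question, left elided as in the original):
# df = load_construction('/PATH/totsatime.xls')
```

Reading the whole sheet and slicing afterwards avoids the chunked loop from the question, which read_excel does not support in the same way; whether the 'Dec-01' row itself should be kept is a guess at the original intent.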