I have a dataset.csv file that looks like this:
time, cost,volume,valid
Fri May 19 10:00:00 PDT 2017, 9.1,3.2,True
Fri May 19 11:03:09 PDT 2017, 5.2,4.2,False
Please help me parse this dataset so that the column types are: column1: date, column2: float, column3: float, column4: boolean
Thanks, CG
Answer 0 (score: 1)
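You can use read_csv with the parameters skipinitialspace (to drop the space after each delimiter) and parse_dates (to convert the first column to datetimes):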
import pandas as pd
# pandas.compat.StringIO was removed from later pandas releases; use the standard library instead
from io import StringIO
temp = """time, cost,volume,valid
Fri May 19 10:00:00 PDT 2017, 9.1,3.2,True
Fri May 19 11:03:09 PDT 2017, 5.2,4.2,False"""
# after testing, replace StringIO(temp) with 'dataset.csv'
df = pd.read_csv(StringIO(temp), skipinitialspace=True, parse_dates=[0])
print(df)
                 time  cost  volume  valid
0 2017-05-19 10:00:00   9.1     3.2   True
1 2017-05-19 11:03:09   5.2     4.2  False
print(df.dtypes)
time      datetime64[ns]
cost             float64
volume           float64
valid               bool
dtype: object
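If the automatic date parsing trips over the "PDT" abbreviation in your pandas version, an explicit format is one possible workaround. The following is only a minimal sketch: it assumes every row carries the same " PDT" suffix, and the names raw and stripped are made up for illustration. It drops the abbreviation and parses the remainder with pd.to_datetime, so the resulting timestamps are timezone-naive.

```python
from io import StringIO

import pandas as pd

# Sample data copied from the question; swap StringIO(raw) for the real file path.
raw = """time, cost,volume,valid
Fri May 19 10:00:00 PDT 2017, 9.1,3.2,True
Fri May 19 11:03:09 PDT 2017, 5.2,4.2,False"""

# Read without parse_dates so the time column stays as plain strings for now.
df = pd.read_csv(StringIO(raw), skipinitialspace=True)

# Assumption: every row uses the same " PDT" abbreviation, so it can be removed
# and the rest parsed with a fixed format (the result carries no timezone).
stripped = df["time"].str.replace(" PDT", "", regex=False)
df["time"] = pd.to_datetime(stripped, format="%a %b %d %H:%M:%S %Y")

print(df.dtypes)
```

The other three columns need no extra handling here: with skipinitialspace=True, read_csv infers float for cost and volume and bool for valid on its own.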