I read a CSV that has separate date and time columns. I build an index from them as follows:
data = p.read_csv(fileName,usecols=["date","time","price"])
data.set_index(["date","time"],inplace=True)
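However, this is not very useful when I want the number of days or hours between rows. How can I build a single datetime index from the separate date and time columns?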
Answer 0 (score: 1)
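I think you need the parse_dates parameter with a nested list of the columns to combine, together with the index_col parameter. The new column is named by joining the original column names with _: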
data = p.read_csv(fileName,
                  usecols=["date","time","price"],
                  parse_dates=[["date","time"]],
                  index_col=['date_time'])
Sample:
import pandas as pd
# pandas.compat.StringIO is gone from current pandas; use the standard library
from io import StringIO
temp=u"""date,time,price
2015-01-01,14:00:10,7
2014-01-01,10:20:10,1"""
# after testing, replace 'StringIO(temp)' with 'filename.csv'
df = pd.read_csv(StringIO(temp),
                 usecols=["date","time","price"],
                 parse_dates=[["date","time"]],
                 index_col=['date_time'])
print (df)
                     price
date_time
2015-01-01 14:00:10      7
2014-01-01 10:20:10      1
print (df.index)
DatetimeIndex(['2015-01-01 14:00:10', '2014-01-01 10:20:10'],
              dtype='datetime64[ns]',
              name='date_time', freq=None)
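As far as I know, recent pandas releases (2.2 and later) deprecate combining columns through a nested parse_dates list, so an alternative may be useful. The following is only a sketch, assuming the same two-row sample data as above: it reads both columns as plain strings, joins them, and parses the result with pd.to_datetime. The names elapsed, days and hours are made up for illustration. It also shows one way to get the days or hours between rows that the question asks about.

```python
from io import StringIO

import pandas as pd

# Same sample data as above; swap StringIO(temp) for a real file path later.
temp = """date,time,price
2015-01-01,14:00:10,7
2014-01-01,10:20:10,1"""

# Read date and time as ordinary string columns (no parse_dates here).
df = pd.read_csv(StringIO(temp), usecols=["date", "time", "price"])

# Join the two string columns and parse them into one datetime column.
df["date_time"] = pd.to_datetime(df["date"] + " " + df["time"])

# Use the combined column as the index and sort it chronologically.
df = df.drop(columns=["date", "time"]).set_index("date_time").sort_index()

# Elapsed time between consecutive rows; the first row has no predecessor (NaT).
elapsed = df.index.to_series().diff()
days = elapsed.dt.days                      # whole days between rows
hours = elapsed.dt.total_seconds() / 3600   # fractional hours between rows

print(df)
print(elapsed)
```

Sorting the index first keeps the differences positive; skip sort_index() if the original row order matters to you.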