I am loading an xlsx file with `pandas.read_excel(..., parse_dates=True, index_col=0, ...)`. The file starts like this:
| UTC Time | Alt | ... |
|----------|------|-----|
| 13:18:44 | 1234 | ... |
| 13:18:45 | 1235 | ... |
| 13:18:46 | 1236 | ... |
| 13:18:47 | 1237 | ... |
The index of the resulting DataFrame is a valid `DateTime`, but its `date()` returns the current day, which the file does not provide.
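Does anyone know a way to detect that `pandas` defaulted to the current date, without opening the file with `xlrd` or adding the first column a second time as an extra data column parsed as plain strings?
Thanks for your help!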
Here is a test case I put together; note the use of the `parse_dates` argument:
>>> import pandas as pd
>>> import xlrd
>>> fn = "test/dfdr_example 2.xlsx"
>>> data_to_retrieve = [3, 4]
So if we open the file "raw", no date is set, only a time:
>>> wb = xlrd.open_workbook(fn)
>>> for row in wb.sheet_by_index(0).get_rows():
...     print(row)
[text:'UTC Time (hh:mm:ss)', text:'2 GPS Groundspeed (knots)', text:'Auto Speed Active (discrete)', text:'Approach Identifier Right (ASCII)']
[xldate:0.5816087962962962, number:207.0, text:'engaged', empty:'']
[xldate:0.5816203703703704, number:208.0, text:'engaged', empty:'']
[xldate:0.5816319444444444, number:210.0, text:'engaged', number:23.0]
[xldate:0.5816435185185186, number:211.0, text:'engaged', empty:'']
[xldate:0.5816550925925926, number:212.0, text:'engaged', empty:'']
[xldate:0.5816666666666667, number:213.0, text:'engaged', empty:'']
[xldate:0.5816782407407407, number:214.0, text:'engaged', number:23.0]
[xldate:0.5816898148148147, number:215.0, text:'engaged', empty:'']
[xldate:0.5817013888888889, number:216.0, text:'engaged', empty:'']
[xldate:0.5817129629629629, number:217.0, text:'engaged', empty:'']
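For reference, each `xldate` value above is a fraction of a day with no whole-day (date) part. The following is a minimal sketch of the conversion by hand, assuming the values are stored at one-second resolution; `excel_fraction_to_time` is a hypothetical helper name, not part of `xlrd` or `pandas`.

```python
from datetime import datetime, timedelta

def excel_fraction_to_time(fraction):
    """Convert an Excel day fraction (0 <= fraction < 1) to a datetime.time."""
    # One day has 86400 seconds; round to whole seconds (assumed resolution).
    seconds = round(fraction * 86400)
    # The date used here is arbitrary: only the time-of-day part is kept.
    return (datetime(2000, 1, 1) + timedelta(seconds=seconds)).time()

print(excel_fraction_to_time(0.5816087962962962))  # 13:57:31
```

The first data row therefore encodes 13:57:31 and nothing else, which matches the time part of the index shown later.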
Now opening it with `pandas`:
>>> df = pd.read_excel(fn, parse_dates=True, index_col=0, use_cols=[0] + data_to_retrieve)
>>> df
                     2 GPS Groundspeed (knots)  Auto Speed Active (discrete)  Approach Identifier Right (ASCII)
UTC Time (hh:mm:ss)
2018-11-26 13:57:31                        207                       engaged                                NaN
2018-11-26 13:57:32                        208                       engaged                                NaN
2018-11-26 13:57:33                        210                       engaged                                 23
2018-11-26 13:57:34                        211                       engaged                                NaN
2018-11-26 13:57:35                        212                       engaged                                NaN
2018-11-26 13:57:36                        213                       engaged                                NaN
>>> df.index[0]
Timestamp('2018-11-26 13:57:31')
Some further illustration of what happens:
>>> df = pd.read_excel(fn, parse_date=True, index_col=0, use_cols=[0] + data_to_retrieve)
>>> df1 = pd.read_excel(fn, parse_dates=True, index_col=0, use_cols=[0] + data_to_retrieve)
>>> df.index, df1.index
(Index([13:57:31, 13:57:32, 13:57:33, 13:57:34, 13:57:35, 13:57:36, 13:57:37,
13:57:38, 13:57:39, 13:57:40, 13:57:41, 13:57:42, 13:57:43, 13:57:44,
13:57:45, 13:57:46, 13:57:47, 13:57:48, 13:57:49, 13:57:50, 13:57:51,
13:57:52, 13:57:53, 13:57:54, 13:57:55, 13:57:56, 13:57:57, 13:57:58],
dtype='object', name='UTC Time (hh:mm:ss)'),
DatetimeIndex(['2018-11-26 13:57:31', '2018-11-26 13:57:32',
'2018-11-26 13:57:33', '2018-11-26 13:57:34',
'2018-11-26 13:57:35', '2018-11-26 13:57:36',
'2018-11-26 13:57:37', '2018-11-26 13:57:38',
'2018-11-26 13:57:39', '2018-11-26 13:57:40',
'2018-11-26 13:57:41', '2018-11-26 13:57:42',
'2018-11-26 13:57:43', '2018-11-26 13:57:44',
'2018-11-26 13:57:45', '2018-11-26 13:57:46',
'2018-11-26 13:57:47', '2018-11-26 13:57:48',
'2018-11-26 13:57:49', '2018-11-26 13:57:50',
'2018-11-26 13:57:51', '2018-11-26 13:57:52',
'2018-11-26 13:57:53', '2018-11-26 13:57:54',
'2018-11-26 13:57:55', '2018-11-26 13:57:56',
'2018-11-26 13:57:57', '2018-11-26 13:57:58'],
dtype='datetime64[ns]', name='UTC Time (hh:mm:ss)', freq=None))
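The first index in the output above has `dtype='object'` and shows bare times. Assuming its elements are `datetime.time` objects (an assumption based on how they print, not something confirmed in this post), a check along these lines could tell a time-only column apart from a real datetime column. `is_time_only` is a hypothetical helper, and the sample list merely stands in for that index.

```python
import datetime

def is_time_only(values):
    """Return True if every value is a datetime.time, i.e. carries no date part."""
    values = list(values)
    # datetime.datetime is not a subclass of datetime.time, so full datetimes fail.
    return bool(values) and all(isinstance(v, datetime.time) for v in values)

# Stand-in for the object-dtype index read without date parsing.
sample = [datetime.time(13, 57, 31), datetime.time(13, 57, 32)]
print(is_time_only(sample))
```

This does require reading the column once without date parsing, so it may not satisfy the constraint of avoiding a second pass over the file.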
So we see that `pandas` somehow sets a date. How can I test whether it defaulted to the current date?
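One rough heuristic, sketched here under the assumption that the filled-in date is simply the day on which the file was read, is to check whether every entry of the index falls on that day. `all_on_date` is a hypothetical helper; it works on any iterable of objects with a `date()` method, which should include the `Timestamp` entries of a `DatetimeIndex`.

```python
from datetime import date, datetime

def all_on_date(timestamps, day=None):
    """Return True if every timestamp's calendar date equals `day` (default: today)."""
    if day is None:
        day = date.today()
    stamps = list(timestamps)
    # An empty input gives no evidence either way, so report False.
    return bool(stamps) and all(ts.date() == day for ts in stamps)

# Stand-in for df.index, using the values shown in this post.
sample = [datetime(2018, 11, 26, 13, 57, 31), datetime(2018, 11, 26, 13, 57, 32)]
print(all_on_date(sample, day=date(2018, 11, 26)))  # True
```

This cannot distinguish a defaulted date from a file that genuinely records today's date, and a read that straddles midnight could give a false negative, so it is a heuristic rather than a reliable test.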