I have statistics stored in CSV files, some of which are huge, with thousands of rows. The structure is:
"Result : Stat01"
"Save Time: 09/23/2019 19:01:27"
"User Name:admin"
"Total 1,365 Records"
"Start Time","Period","Messages Received","Messages Sent"
09/23/2019 01:30:00,5,114,57
09/23/2019 01:30:00,5,0,0
09/23/2019 01:30:00,5,47493,46911
09/23/2019 01:30:00,5,47772,46347
09/23/2019 01:30:00,5,0,0
09/23/2019 01:35:00,5,32990,34652
09/23/2019 01:35:00,5,142,63
09/23/2019 01:35:00,5,0,0
09/23/2019 01:35:00,5,47379,46297
09/23/2019 01:35:00,5,46324,45750
09/23/2019 01:35:00,5,0,0
09/23/2019 01:40:00,5,31974,33969
09/23/2019 01:40:00,5,114,57
09/23/2019 01:40:00,5,0,0
09/23/2019 01:40:00,5,44701,43845
09/23/2019 01:40:00,5,44903,43770
09/23/2019 01:40:00,5,0,0
09/23/2019 01:45:00,5,33531,35274
09/23/2019 01:45:00,5,126,63
09/23/2019 01:45:00,5,0,0
09/23/2019 01:45:00,5,45821,43960
09/23/2019 01:45:00,5,44988,45120
09/23/2019 01:45:00,5,0,0
09/23/2019 01:50:00,5,32544,33804
09/23/2019 01:50:00,5,112,56
09/23/2019 01:50:00,5,0,0
09/23/2019 01:50:00,5,45645,44609
09/23/2019 01:50:00,5,44878,44628
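I tried to parse it in pandas using parse_dates and date_parser, but the resulting pandas DataFrame contains only the date; the time is dropped. The statistics are recorded at 5-minute intervals, so I need the time. The code I used is: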
import pandas as pd
from datetime import datetime

mydateparser = lambda x: datetime.strptime(x, "%m/%d/%Y %H:%M:%S")
sta = pd.read_csv('Export.csv', skiprows=7, parse_dates=["Start Time"], date_parser=mydateparser)
sta.head()
The output has no time:
   Start Time  Period  Messages Received  Messages Sent
0  2019-09-23       5              46803          49665
1  2019-09-23       5                112             56
2  2019-09-23       5                  0              0
3  2019-09-23       5              66647          65771
4  2019-09-23       5              67151          65191
Thanks for your help.
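
For anyone trying to reproduce this, here is a minimal diagnostic sketch using only the standard library. It checks two things independently of pandas: which line the header row actually sits on, and whether the format string "%m/%d/%Y %H:%M:%S" keeps the time component. It assumes the real file has the same layout as the excerpt above, where four metadata lines precede the header row; SAMPLE, TIME_FORMAT, find_header_row and load_records are hypothetical names made up for this illustration.

```python
import csv
from datetime import datetime

# Shortened copy of the excerpt above (assumption: the real export has the same layout).
SAMPLE = '''"Result : Stat01"
"Save Time: 09/23/2019 19:01:27"
"User Name:admin"
"Total 1,365 Records"
"Start Time","Period","Messages Received","Messages Sent"
09/23/2019 01:30:00,5,114,57
09/23/2019 01:35:00,5,32990,34652
'''

TIME_FORMAT = "%m/%d/%Y %H:%M:%S"


def find_header_row(lines, first_column="Start Time"):
    """Return the 0-based index of the line whose first field equals first_column.

    Assumes no quoted field spans several lines, so CSV rows map 1:1 to lines.
    """
    for index, fields in enumerate(csv.reader(lines)):
        if fields and fields[0] == first_column:
            return index
    raise ValueError(f"no header row starting with {first_column!r}")


def load_records(text):
    """Skip the metadata lines and parse "Start Time" into datetime objects."""
    lines = text.splitlines()
    header_row = find_header_row(lines)
    records = []
    for row in csv.DictReader(lines[header_row:]):
        row["Start Time"] = datetime.strptime(row["Start Time"], TIME_FORMAT)
        records.append(row)
    return header_row, records


header_row, records = load_records(SAMPLE)
print("header is on line index", header_row)
for record in records:
    # isoformat(sep=" ") shows both the date and the time of each parsed value.
    print(record["Start Time"].isoformat(sep=" "), record["Messages Received"])
```

On this sample the time survives parsing, so the format string itself does not appear to be the cause. If the header index found for the real file differs from the value given to skiprows, passing the found index as skiprows to pd.read_csv might be worth trying, though that is a guess based on the excerpt only.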