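I am working with a CSV file of flight records. My overall goal is to plot flight delays over a few selected days. I am trying to index these flights by date and scheduled departure time. I have the flight date in month/day/year format and the departure time in hhmm format. Is it possible to reformat the departure-time column to hh:mm in 24-hour time? Would I then just add the columns together and index by the result?
I tried adding the columns together without reformatting the time, and I am not sure matplotlib can recognize this time format for my plot.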
data = pd.read_csv("groundhog_query.csv",parse_dates=[['Flight_Date', 'Scheduled_Dep_Time']])
data.index = data['Flight_Date_Scheduled_Dep_Time']
data
'''
Year,Flight_Date,DAY_OF_YEAR,Unique_Carrier_ID,Airline_ID,Tail_Number,Flight_Number,Origin_Airport_ID,Origin_Market_ID,ORIGIN_AIRPORT_CODE,Origin_State,Destination_Airport_ID,Destination_Market_ID,DESTINATION_AIRPORT_CODE,Dest_State,Scheduled_Dep_Time,Actual_Dep_Time,Dep_Delay,Pos_Dep_Delay,Scheduled_Arr_Time,Actual_Arr_Time,Arr_Delay,Pos_Arr_Delay,Combined_Arr_Delay,Can_Status,Can_Reason,Div_Status,Scheduled_Elapsed_Time,Actual_Elapsed_Time,Carrier_Delay,Weather_Delay,Natl_Airspace_System_Delay,Security_Delay,Late_Aircraft_Delay,Div_Airport_Landings,Div_Landing_Status,Div_Elapsed_Time,Div_Arrival_Delay,Div_Airport_1_ID,Div_1_Tail_Num,Div_Airport_2_ID,Div_2_Tail_Num,Div_Airport_3_ID,Div_3_Tail_Num,Div_Airport_4_ID,Div_4_Tail_Num,Div_Airport_5_ID,Div_5_Tail_Num
2011,2011-01-24,24,MQ,20398,N717MQ,4527,11278,30852,DCA,VA,14492,34492,RDU,NC,1630,1622.0,-8.0,0.0,1735,1722.0,-13.0,0.0,-13.0,0,,0,65,60.0,,,,,,0,,,,,,,,,,,,,
2011,2011-01-25,25,MQ,20398,N736MQ,4527,11278,30852,DCA,VA,14492,34492,RDU,NC,1630,1624.0,-6.0,0.0,1735,1724.0,-11.0,0.0,-11.0,0,,0,65,60.0,,,,,,0,,,,,,,,,,,,,
2011,2011-01-26,26,MQ,20398,N737MQ,4527,11278,30852,DCA,VA,14492,34492,RDU,NC,1630,,,,1735,,,,,,1,B,0,65,,,,,,,,0,,,,,,,,,,,,,,
2011,2011-01-27,27,MQ,20398,N721MQ,4527,11278,30852,DCA,VA,14492,34492,RDU,NC,1630,1832.0,122.0,122.0,1735,1936.0,121.0,121.0,121.0,0,,0,65,64.0,121.0,0.0,0.0,0.
'''
My current result is in month/day/year hhmm format.
Answer 0 (score: 0)
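Use the following steps:
1. Read the CSV without parsing dates.
2. Combine the Flight_Date and Scheduled_Dep_Time columns. Convert Scheduled_Dep_Time to a string first, because it is parsed as an int by default, and zero-pad it to four digits so that a time such as 930 becomes "0930".
3. Convert the combined string to a datetime using a format that matches the data ('%Y-%m-%d %H%M'; the hhmm values contain no colon).
4. Set the newly created column as the index.
import pandas as pd

d = pd.read_csv("groundhog_query.csv")
# Zero-pad the hhmm integers (e.g. 930 -> "0930") before joining them to the date.
dep_time = d.Scheduled_Dep_Time.astype(str).str.zfill(4)
d['Flight_Date_Scheduled_Dep_Time_string'] = d.Flight_Date.str.cat(dep_time, sep=' ')
d['Flight_Date_Scheduled_Dep_Time'] = pd.to_datetime(d.Flight_Date_Scheduled_Dep_Time_string, format='%Y-%m-%d %H%M')
d = d.set_index('Flight_Date_Scheduled_Dep_Time')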
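To address the hh:mm part of the question, here is a minimal sketch that runs without the CSV file. It assumes Flight_Date is stored as YYYY-MM-DD, as in the sample rows. The in-memory sample, its early-morning times (930 and 5), and the column name Scheduled_Dep_hh_mm are made up for illustration and are not part of the original data.

```python
import io

import pandas as pd

# Hypothetical three-row sample that mimics the relevant columns of the CSV.
sample = io.StringIO(
    "Flight_Date,Scheduled_Dep_Time,Dep_Delay\n"
    "2011-01-24,1630,-8.0\n"
    "2011-01-25,930,5.0\n"
    "2011-01-27,5,122.0\n"
)
d = pd.read_csv(sample)

# hhmm values are read as integers and lose leading zeros, so pad to 4 digits.
hhmm = d["Scheduled_Dep_Time"].astype(str).str.zfill(4)

# Optional display column in hh:mm form (illustrative column name).
d["Scheduled_Dep_hh_mm"] = hhmm.str[:2] + ":" + hhmm.str[2:]

# Build the datetime index from the date string plus the padded time.
d.index = pd.DatetimeIndex(
    pd.to_datetime(d["Flight_Date"] + " " + hhmm, format="%Y-%m-%d %H%M"),
    name="Flight_Date_Scheduled_Dep_Time",
)
print(d[["Scheduled_Dep_hh_mm", "Dep_Delay"]])
```

The hh:mm column is only for display. The index itself is a real datetime, so it does not depend on how the time was written in the file.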
A reference for the % directives is here: https://docs.python.org/3.7/library/datetime.html#strftime-and-strptime-behavior
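Once the index is a datetime, matplotlib can plot it directly, and a few days can be selected by slicing with date strings. The following is a rough sketch rather than a finished solution. It uses the Flight_Date, Scheduled_Dep_Time, and Dep_Delay values from the four sample rows in the question, and it assumes Dep_Delay is in minutes. The output file name dep_delay.png is hypothetical.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Three columns taken from the sample rows in the question (cancelled flight has no delay).
sample = io.StringIO(
    "Flight_Date,Scheduled_Dep_Time,Dep_Delay\n"
    "2011-01-24,1630,-8.0\n"
    "2011-01-25,1630,-6.0\n"
    "2011-01-26,1630,\n"
    "2011-01-27,1630,122.0\n"
)
d = pd.read_csv(sample)
hhmm = d["Scheduled_Dep_Time"].astype(str).str.zfill(4)
when = pd.to_datetime(d["Flight_Date"] + " " + hhmm, format="%Y-%m-%d %H%M")
d = d.set_index(when).sort_index()

# Slice a range of days by date string; both endpoints are included.
selected = d.loc["2011-01-25":"2011-01-27", "Dep_Delay"].dropna()

fig, ax = plt.subplots()
ax.plot(selected.index, selected.values, marker="o")
ax.set_xlabel("Scheduled departure")
ax.set_ylabel("Departure delay (minutes, assumed unit)")
fig.autofmt_xdate()  # tilt the date labels so they do not overlap
fig.savefig("dep_delay.png")  # hypothetical output file name
```

Dropping the missing delay removes the cancelled flight on 2011-01-26 from the line. Whether to drop it or show it as a gap depends on what the plot is meant to convey.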