Gremlin query to find connecting scheduled flight routes, filtering dynamically on edge properties

Time: 2019-01-23 22:26:00

Tags: graph-databases gremlin

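I am working with scheduled flight routes. Each route has a service date range and a set of flags for the days of the week on which the flight operates, and I need help formulating a Gremlin query that covers several use cases. The schedule data I am working with is shown below; I have about 4,000 airports and 5 million scheduled routes. A scheduled route looks like this: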

Carrier: AA
Flight: 100
Service date from: 2019-02-01 
Service date to: 2019-07-02
Departure: PDX
Arrival: LHR
FlysMonday: true
FlysTuesday: false
...
FlysSunday: false
DepartureDay: 0
ArrivalDay: 1 <-- overnight flight
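
To make the meaning of the service date range and the weekday flags concrete, here is a minimal sketch in plain Python (not Gremlin, so it runs without a graph server). The dictionary layout and the `operates_on` helper are assumptions made for illustration only. The flags for Wednesday through Saturday are elided in the record above and are set to false here purely as placeholders.

```python
from datetime import date

# Hypothetical in-memory version of the schedule record shown above.
flight = {
    "Carrier": "AA",
    "Flight": "100",
    "ServiceDateFrom": date(2019, 2, 1),
    "ServiceDateTo": date(2019, 7, 2),
    # Index 0 is Monday, matching date.weekday().
    # Only the Monday, Tuesday and Sunday values come from the record;
    # the four in between are placeholders for the elided flags.
    "FlysOn": [True, False, False, False, False, False, False],
}

def operates_on(flight, day):
    """Return True if `day` is inside the service range and the weekday flag is set."""
    in_range = flight["ServiceDateFrom"] <= day <= flight["ServiceDateTo"]
    return in_range and flight["FlysOn"][day.weekday()]

print(operates_on(flight, date(2019, 2, 4)))   # a Monday inside the service range
print(operates_on(flight, date(2019, 2, 5)))   # a Tuesday inside the service range
```

Both conditions have to hold at the same time, which is why a query that only compares `ServiceDateFrom` and `ServiceDateTo` can still return flights that do not fly on the requested day.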

Use cases the Gremlin query should support:

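  1. Select departing flights by some criteria, for example: carrier AA, from HNL to PDX, departing on 2019-01-24 at or after 08:00 and no later than 12:00.
  2. Find connecting flights that depart after the previous flight's arrival time, with a minimum connection time of 90 minutes.
  3. Handle the case where the next connecting flight does not depart until x days later.
  4. Accumulate the total travel time, i.e. the time from first departure to final arrival, including layovers.
  5. Find the direct flights and all non-direct flights from the departure airport to the arrival airport.
  6. Route only via specific layover airports, e.g. LIA to HNL via IAD.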
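
Use cases 2 and 3 come down to one question: given an arrival time and a minimum connection time, when does the connecting flight next depart? The sketch below shows that logic in plain Python as an illustration, not as a Gremlin solution. The `next_departure` helper, the dictionary keys and the `max_days` cut-off are all assumptions, and times are treated as naive values without time zones.

```python
from datetime import date, datetime, time, timedelta

def next_departure(flight, earliest, max_days=7):
    """Return the first departure datetime at or after `earliest`, or None."""
    for offset in range(max_days + 1):
        day = earliest.date() + timedelta(days=offset)
        in_service = flight["ServiceDateFrom"] <= day <= flight["ServiceDateTo"]
        if in_service and day.weekday() in flight["Weekdays"]:
            departure = datetime.combine(day, flight["DepartureTime"])
            # On the first day the departure may already be too early.
            if departure >= earliest:
                return departure
    return None

# Hypothetical connecting flight: 16:00 on Saturdays (5) and Sundays (6).
connecting = {
    "ServiceDateFrom": date(2019, 2, 5),
    "ServiceDateTo": date(2019, 3, 17),
    "Weekdays": {5, 6},
    "DepartureTime": time(16, 0),
}

arrival = datetime(2019, 2, 7, 13, 0)        # a Thursday
min_connect = timedelta(minutes=90)
departure = next_departure(connecting, arrival + min_connect)
layover = departure - arrival                # counts toward total travel time
print(departure, layover)
```

With this data the next usable departure is on Saturday, two days after the arrival, so the layover is 51 hours. Adding that layover to the flight times of both legs gives the accumulated travel time asked for in use case 4.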

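This is what I have so far. When I run it against the full graph of 4,000 vertices (airports) and 5 million edges (scheduled routes), it is very slow. I think part of the problem is that I apply the filter only after the traversal.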

g.V().has('airport', 'name', 'HNL').as('depAirport').
  repeat(outE().as('flight').inV().as('stop').simplePath()).
    times(2).emit().
  has('airport', 'name', 'LHR').
  filter(select(first, 'flight').
         and(has('Carrier', 'AA'),
             has('DepartureTime', gte(Date.parse(timeFormat, '08:00:00'))),
             has('ServiceDateFrom', lte(Date.parse(dateFormat, '2019-01-24'))),
             has('ServiceDateTo', gte(Date.parse(dateFormat, '2019-01-24'))))).
  path().by('name').by(valueMap())

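I am also not sure how to formulate the query so that it supports use cases 2, 3, 4 and 6. With an unknown number of non-direct flights between the departure and arrival airports, I am not sure how to apply the repeat only after the first flight.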

I tried the following, which prunes the departing flights before the repeat, but unlike the other query it does not produce any results.

g.V().has('airport', 'name', 'HNL').as('depAirport').
  outE().as('flight').
  and(has('Carrier', 'AA'),
      has('DepartureTime', gte(Date.parse(timeFormat, '08:00:00'))),
      has('ServiceDateFrom', lte(Date.parse(dateFormat, '2019-01-24'))),
      has('ServiceDateTo', gte(Date.parse(dateFormat, '2019-01-24')))).
  repeat(inV().as('stop').outE().as('flight').simplePath()).
    times(2).emit().
  has('airport', 'name', 'LHR').
  path().by('name').by(valueMap())

Here is the graph setup code:

dateFormat = "yyyy-MM-dd"
timeFormat = "HH:mm:ss"  // HH = 24-hour clock; with "hh" (12-hour clock) '12:00:00' parses as midnight

graph = TinkerFactory.createModern()
g = graph.traversal()
g.addV('airport').property('name','PDX').as('PDX').
  addV('airport').property('name','JFK').as('JFK').
  addV('airport').property('name','HNL').as('HNL').
  addV('airport').property('name','ORD').as('ORD').
  addV('airport').property('name','IAD').as('IAD').
  addV('airport').property('name','LHR').as('LHR').
  addV('airport').property('name','CAN').as('CAN').
  addV('airport').property('name','LAX').as('LAX').
  addE('flight').from('HNL').to('PDX').property('Carrier', 'AA').property('FlightNumber', '100').property('ServiceDateFrom', Date.parse(dateFormat, '2019-01-23')).property('ServiceDateTo', Date.parse(dateFormat, '2019-03-20')).property('DepartureTime', Date.parse(timeFormat, '08:00:00')).property('ArrivalTime', Date.parse(timeFormat, '13:00:00')).property('DepartureDay', 0).property('ArrivalDay','0').property('FlysSunday', true).property('FlysMonday', false).property('FlysTuesday', false).property('FlysWednesday', false).property('FlysThursday', false).property('FlysFriday', false).property('FlysSaturday', true).
  addE('flight').from('HNL').to('PDX').property('Carrier', 'AA').property('FlightNumber', '201').property('ServiceDateFrom', Date.parse(dateFormat, '2019-01-23')).property('ServiceDateTo', Date.parse(dateFormat, '2019-03-20')).property('DepartureTime', Date.parse(timeFormat, '08:00:00')).property('ArrivalTime', Date.parse(timeFormat, '13:00:00')).property('DepartureDay', 0).property('ArrivalDay','0').property('FlysSunday', false).property('FlysMonday', true).property('FlysTuesday', true).property('FlysWednesday', true).property('FlysThursday', true).property('FlysFriday', true).property('FlysSaturday', false).
  addE('flight').from('PDX').to('LHR').property('Carrier', 'BA').property('FlightNumber', '100').property('ServiceDateFrom', Date.parse(dateFormat, '2019-01-31')).property('ServiceDateTo', Date.parse(dateFormat, '2019-03-05')).property('DepartureTime', Date.parse(timeFormat, '13:30:00')).property('ArrivalTime', Date.parse(timeFormat, '23:00:00')).property('DepartureDay', 0).property('ArrivalDay','0').property('FlysSunday', true).property('FlysMonday', false).property('FlysTuesday', false).property('FlysWednesday', false).property('FlysThursday', false).property('FlysFriday', false).property('FlysSaturday', true).
  addE('flight').from('PDX').to('LHR').property('Carrier', 'BA').property('FlightNumber', '201').property('ServiceDateFrom', Date.parse(dateFormat, '2019-02-05')).property('ServiceDateTo', Date.parse(dateFormat, '2019-03-17')).property('DepartureTime', Date.parse(timeFormat, '13:30:00')).property('ArrivalTime', Date.parse(timeFormat, '23:00:00')).property('DepartureDay', 0).property('ArrivalDay','0').property('FlysSunday', true).property('FlysMonday', false).property('FlysTuesday', false).property('FlysWednesday', false).property('FlysThursday', false).property('FlysFriday', false).property('FlysSaturday', true).
  addE('flight').from('PDX').to('LHR').property('Carrier', 'BA').property('FlightNumber', '202').property('ServiceDateFrom', Date.parse(dateFormat, '2019-02-05')).property('ServiceDateTo', Date.parse(dateFormat, '2019-03-17')).property('DepartureTime', Date.parse(timeFormat, '16:00:00')).property('ArrivalTime', Date.parse(timeFormat, '02:00:00')).property('DepartureDay', 0).property('ArrivalDay','1').property('FlysSunday', true).property('FlysMonday', false).property('FlysTuesday', false).property('FlysWednesday', false).property('FlysThursday', false).property('FlysFriday', false).property('FlysSaturday', true).
  addE('flight').from('PDX').to('LHR').property('Carrier', 'BA').property('FlightNumber', '203').property('ServiceDateFrom', Date.parse(dateFormat, '2019-02-05')).property('ServiceDateTo', Date.parse(dateFormat, '2019-03-17')).property('DepartureTime', Date.parse(timeFormat, '16:00:00')).property('ArrivalTime', Date.parse(timeFormat, '02:00:00')).property('DepartureDay', 0).property('ArrivalDay','1').property('FlysSunday', false).property('FlysMonday', true).property('FlysTuesday', true).property('FlysWednesday', true).property('FlysThursday', true).property('FlysFriday', true).property('FlysSaturday', false).
  addE('flight').from('ORD').to('PDX').property('Carrier', 'CC').property('FlightNumber', '66').property('ServiceDateFrom', Date.parse(dateFormat, '2019-08-11')).property('ServiceDateTo', Date.parse(dateFormat, '2019-12-11')).property('DepartureTime', Date.parse(timeFormat, '06:00:00')).property('ArrivalTime', Date.parse(timeFormat, '12:00:00')).property('DepartureDay', 0).property('ArrivalDay','0').property('FlysSunday', true).property('FlysMonday', true).property('FlysTuesday', true).property('FlysWednesday', true).property('FlysThursday', true).property('FlysFriday', true).property('FlysSaturday', false).
  addE('flight').from('ORD').to('LAX').property('Carrier', 'CC').property('FlightNumber', '76').property('ServiceDateFrom', Date.parse(dateFormat, '2019-08-11')).property('ServiceDateTo', Date.parse(dateFormat, '2019-12-11')).property('DepartureTime', Date.parse(timeFormat, '06:00:00')).property('ArrivalTime', Date.parse(timeFormat, '12:00:00')).property('DepartureDay', 0).property('ArrivalDay','0').property('FlysSunday', true).property('FlysMonday', true).property('FlysTuesday', true).property('FlysWednesday', true).property('FlysThursday', true).property('FlysFriday', true).property('FlysSaturday', false).
  addE('flight').from('LAX').to('CAN').property('Carrier', 'CC').property('FlightNumber', '12').property('ServiceDateFrom', Date.parse(dateFormat, '2019-03-11')).property('ServiceDateTo', Date.parse(dateFormat, '2019-12-24')).property('DepartureTime', Date.parse(timeFormat, '15:00:00')).property('ArrivalTime', Date.parse(timeFormat, '05:00:00')).property('DepartureDay', 0).property('ArrivalDay','1').property('FlysSunday', false).property('FlysMonday', false).property('FlysTuesday', true).property('FlysWednesday', false).property('FlysThursday', false).property('FlysFriday', false).property('FlysSaturday', true).
  addE('flight').from('PDX').to('CAN').property('Carrier', 'CC').property('FlightNumber', '22').property('ServiceDateFrom', Date.parse(dateFormat, '2019-03-11')).property('ServiceDateTo', Date.parse(dateFormat, '2019-12-24')).property('DepartureTime', Date.parse(timeFormat, '15:00:00')).property('ArrivalTime', Date.parse(timeFormat, '06:00:00')).property('DepartureDay', 0).property('ArrivalDay','1').property('FlysSunday', false).property('FlysMonday', false).property('FlysTuesday', true).property('FlysWednesday', false).property('FlysThursday', false).property('FlysFriday', false).property('FlysSaturday', true).iterate()

Example queries and responses:

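A route from HNL to LHR departing on 2019-01-24 (Thursday) from 08:00:00 but no later than 12:00:00, with a minimum connection time of 90 minutes, should return the following: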

[DepartureAirport: HNL, Flight: AA-201, ConnectingAirport: PDX, Flight: BA-203, ArrivalAirport: LHR, TravelTime: 18 hours] <= this is the best route: it meets the minimum connection time of 90 minutes and has the shortest overall travel time, 18 hours, made up of 5 hours from HNL to PDX, a 3-hour layover and 10 hours to LHR.
[DepartureAirport: HNL, Flight: AA-201, ConnectingAirport: PDX, Flight: BA-201, ArrivalAirport: LHR, TravelTime: 39.5 hours] <= this route works, but there is a layover in PDX from Thursday until Friday for the BA-201 flight, because the same-day connecting flight departs too soon after the first leg's arrival to meet the minimum connection time. The total travel time of 39.5 hours consists of 5 hours from HNL to PDX, a 24.5-hour layover and 10 hours to LHR.
[DepartureAirport: HNL, Flight: AA-201, ConnectingAirport: PDX, Flight: BA-202, ArrivalAirport: LHR, TravelTime: 90 hours] <= this route works, but there is a layover in PDX from Thursday until Sunday for the BA-202 flight. The total travel time of 90 hours consists of 5 hours from HNL to PDX, a 75-hour layover and 10 hours to LHR. I am interested in routes like this as well, because in some cases we are routing to remote airports with infrequent flights.
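
The 18-hour figure for the first expected route can be reproduced with a few lines of date arithmetic. This is only a sketch in plain Python: the `leg` helper is hypothetical, `arrival_day` plays the role of the `ArrivalDay` offset from the schedule record, and, like the question's own arithmetic, it treats all times as naive values and ignores time zones.

```python
from datetime import date, datetime, time, timedelta

def leg(day, departure_time, arrival_time, arrival_day=0):
    """Return (departure, arrival) datetimes for one flight on `day`."""
    departure = datetime.combine(day, departure_time)
    arrival = datetime.combine(day + timedelta(days=arrival_day), arrival_time)
    return departure, arrival

thursday = date(2019, 1, 24)
first = leg(thursday, time(8, 0), time(13, 0))        # AA-201, HNL to PDX
second = leg(thursday, time(16, 0), time(2, 0), 1)    # BA-203, PDX to LHR, overnight

layover = second[0] - first[1]                        # time on the ground in PDX
flight_time = (first[1] - first[0]) + (second[1] - second[0])
total = second[1] - first[0]                          # first departure to final arrival
print(layover, flight_time, total)
```

The total is the flight time of both legs plus the layover, which matches the 5 + 3 + 10 breakdown given for this route.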

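A route from PDX to CAN departing on 2019-03-19 (Tuesday) from 16:00:00 but no later than 20:00, with a minimum connection time of 60 minutes, should return this direct flight, since the sample graph contains only direct flights for this route: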

[DepartureAirport: PDX, Flight: CC-22, ArrivalAirport: CAN, TravelTime: 14 hours] <= this is the best route since it is direct, so the minimum connection time, which only applies when there is a layover, does not matter.

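A route from ORD to CAN departing on 2019-03-19 (Tuesday) from 16:00:00 but no later than 20:00, with a minimum connection time of 60 minutes, and allowing a stop only via 'LAX', should return this route: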

[DepartureAirport: ORD, Flight: CC-76, ConnectingAirport: LAX, Flight: CC-12, ArrivalAirport: CAN, TravelTime: 24 hours] <= this is the best route since it satisfies the minimum connection time and stops via 'LAX'. The total travel time of 24 hours consists of 6 hours from ORD to LAX, a 3-hour layover and 15 hours to CAN.

1 answer:

Answer 0 (score: 4)

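The problem with your current/proposed model is that the query has to keep track of a lot of things, and it is hard to do all of that in a single traversal. To make this easier, I remodeled the graph completely. Flights are now vertices, and the edges are the connections that link one flight to the next. With this model it is much easier to keep track of things, because all the variables flow along with the traversal. The solution yields the following results: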

=== Flights from HNL to LHR on 2019-01-24 ===

* Option 1 (1 stop, 18 hours)
  - HNL --[AA-201]-> PDX (Thursday, 2019-01-24 08:00 to 13:00)
    (3 hours layover)
  - PDX --[BA-203]-> LHR (Thursday, 2019-01-24 16:00 to Friday, 2019-01-25 02:00)

* Option 2 (1 stop, 2 days 15 hours)
  - HNL --[AA-201]-> PDX (Thursday, 2019-01-24 08:00 to 13:00)
    (2 days 30 minutes layover)
  - PDX --[BA-201]-> LHR (Saturday, 2019-01-26 13:30 to 23:00)


=== Flights from PDX to CAN on 2019-03-19 ===

* Option 1 (direct, 15 hours)
  - PDX --[CC-22]-> CAN (Tuesday, 2019-03-19 15:00 to Wednesday, 2019-03-20 06:00)


=== Flights from ORD to CAN on 2019-08-20 ===

* Option 1 (1 stop, 23 hours)
  - ORD --[CC-76]-> LAX (Tuesday, 2019-08-20 06:00 to 12:00)
    (3 hours layover)
  - LAX --[CC-12]-> CAN (Tuesday, 2019-08-20 15:00 to Wednesday, 2019-08-21 05:00)

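As mentioned in the comments above, I had to change a few things. For the flights from HNL to LHR, I changed some service start dates to make the flights available. For the flight from PDX to CAN, I changed the departure time from 16:00 to 15:00. For the flights from ORD to CAN, I changed the date from 2019-03-19 to 2019-08-20 (also a Tuesday).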

I published the project on GitHub: https://github.com/dkuppitz/weiping

The sample graph is generated by FlightRouteGraph::createSampleGraph(), and the actual query is done in App::findFlights(...).
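
To illustrate the remodeling idea (flights as vertices, connections between flights as edges) without a graph database, here is a rough sketch in plain Python. It is not the code from the linked repository. The `Flight` type and the `build_connections` and `find_routes` helpers are hypothetical, and the sketch deliberately leaves out dates, times and minimum connection times so that only the shape of the model is visible.

```python
from collections import defaultdict
from typing import NamedTuple

class Flight(NamedTuple):
    code: str
    origin: str
    destination: str

# Flights from the question's sample graph, reduced to their endpoints.
FLIGHTS = [
    Flight("AA-100", "HNL", "PDX"), Flight("AA-201", "HNL", "PDX"),
    Flight("BA-100", "PDX", "LHR"), Flight("BA-201", "PDX", "LHR"),
    Flight("BA-202", "PDX", "LHR"), Flight("BA-203", "PDX", "LHR"),
    Flight("CC-66", "ORD", "PDX"), Flight("CC-76", "ORD", "LAX"),
    Flight("CC-12", "LAX", "CAN"), Flight("CC-22", "PDX", "CAN"),
]

def build_connections(flights):
    """Edges of the remodeled graph: flight -> flights leaving its arrival airport."""
    by_origin = defaultdict(list)
    for flight in flights:
        by_origin[flight.origin].append(flight)
    return {f.code: by_origin[f.destination] for f in flights}

def find_routes(flights, origin, destination, max_stops=1, via=None):
    """List routes as flight codes; `via` optionally restricts layover airports."""
    connections = build_connections(flights)
    routes = []

    def walk(path):
        last = path[-1]
        if last.destination == destination:
            routes.append([f.code for f in path])
            return
        if len(path) > max_stops:
            return                      # no stops left to spend
        if via is not None and last.destination not in via:
            return                      # layover airport not allowed
        visited = {path[0].origin} | {f.destination for f in path}
        for nxt in connections[last.code]:
            if nxt.destination not in visited:   # same idea as simplePath()
                walk(path + [nxt])

    for first in flights:
        if first.origin == origin:
            walk([first])
    return routes

print(find_routes(FLIGHTS, "ORD", "CAN", via={"LAX"}))
print(find_routes(FLIGHTS, "PDX", "CAN"))
```

Because every step of the walk moves from one flight to the next, per-flight values such as arrival time or accumulated travel time could be carried along in `path`, which is the property the answer describes as variables flowing with the traversal.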