So I have a DataFrame called trips with the following information:
route_id service_id shape_id trip_id
0 BX12 GH_B6-Weekday BX120805 GH_B6-Weekday-004000_BX12_1
1 BX12 GH_B6-Weekday BX120809 GH_B6-Weekday-009000_BX12_1
2 BX12 GH_B6-Weekday BX120792 GH_B6-Weekday-013000_BX12_1
3 BX12 GH_B6-Weekday BX120809 GH_B6-Weekday-017000_BX12_1
4 BX12 GH_B6-Weekday BX120792 GH_B6-Weekday-021000_BX12_1
...
I also have a Series called invalidTrips with the following information:
trip_id
11760139-BPPB6-BP_B6-Weekday-10 16
11760139-BPPB6-BP_B6-Weekday-10-SDon 16
11760140-BPPB6-BP_B6-Weekday-10 19
11760140-BPPB6-BP_B6-Weekday-10-SDon 19
11760141-BPPB6-BP_B6-Weekday-10 16
...
How do I select all rows in trips whose trip_id does not match a trip_id in invalid_trips?

# Grab the number of trips made outside min and max hour.
tooEarly = stopTimes['arrival_time'] < base_mintime
tooLate = stopTimes['departure_time'] > base_maxtime
invalidTrips = stopTimes[(tooEarly | tooLate)].groupby('trip_id').size()
# Filter out the invalid trips.
print(invalidTrips.size)
print(trips.size)
in_validTrips = ~trips.trip_id.isin(invalidTrips)
validTrips = trips[in_validTrips][['route_id', 'service_id', 'shape_id']]
print(validTrips.size)
For whatever reason, even though invalidTrips.size can change depending on base_mintime and base_maxtime, validTrips.size stays the same, even though I would expect it to depend on invalidTrips.size. Why is this so?
(For more context, all of this is extracted from GTFS data.)
Answer 0 (score: 2)
UPDATE:
Try the isin() function and the ~ operator.
Following @EdChum's correction in the comments, if invalid_trips is a Series:
trips[~trips.trip_id.isin(invalidTrips.index)]
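The reason the index matters can be seen in a small sketch. When isin() is given a Series, it compares against that Series' values (here the per-trip counts), while the trip IDs live in its index. The data below is a made-up miniature of the question's frames, not the real GTFS feed:

```python
import pandas as pd

# Toy data mirroring the question; the counts are made-up values.
trips = pd.DataFrame({
    "route_id": ["BX12", "BX12", "BX12"],
    "trip_id": [
        "GH_B6-Weekday-004000_BX12_1",
        "GH_B6-Weekday-009000_BX12_1",
        "GH_B6-Weekday-017000_BX12_1",
    ],
})
invalidTrips = pd.Series(
    [16, 11],
    index=pd.Index(
        ["11760139-BPPB6-BP_B6-Weekday-10", "GH_B6-Weekday-017000_BX12_1"],
        name="trip_id",
    ),
)

# isin() looks at the Series *values* (16 and 11), so no trip_id can match.
by_values = trips.trip_id.isin(invalidTrips)
# Passing the index compares against the actual trip IDs.
by_index = trips.trip_id.isin(invalidTrips.index)

print(by_values.tolist())
print(by_index.tolist())
print(trips[~by_index])
```

With the values-based mask nothing is ever excluded, which would explain why validTrips.size never changed in the question.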
TEST:
In [39]: invalidTrips
Out[39]:
trip_id
11760139-BPPB6-BP_B6-Weekday-10 16
11760139-BPPB6-BP_B6-Weekday-10-SDon 16
11760140-BPPB6-BP_B6-Weekday-10 19
11760140-BPPB6-BP_B6-Weekday-10-SDon 19
11760141-BPPB6-BP_B6-Weekday-10 16
GH_B6-Weekday-017000_BX12_1 11 # <-- I've added it intentionally
Name: val, dtype: int64
In [40]: trips
Out[40]:
route_id service_id shape_id trip_id
0 BX12 GH_B6-Weekday BX120805 GH_B6-Weekday-004000_BX12_1
1 BX12 GH_B6-Weekday BX120809 GH_B6-Weekday-009000_BX12_1
2 BX12 GH_B6-Weekday BX120792 GH_B6-Weekday-013000_BX12_1
3 BX12 GH_B6-Weekday BX120809 GH_B6-Weekday-017000_BX12_1 # <-- exclude this row
4 BX12 GH_B6-Weekday BX120792 GH_B6-Weekday-021000_BX12_1
In [41]: trips[~trips.trip_id.isin(invalidTrips.index)]
Out[41]:
route_id service_id shape_id trip_id
0 BX12 GH_B6-Weekday BX120805 GH_B6-Weekday-004000_BX12_1
1 BX12 GH_B6-Weekday BX120809 GH_B6-Weekday-009000_BX12_1
2 BX12 GH_B6-Weekday BX120792 GH_B6-Weekday-013000_BX12_1
4 BX12 GH_B6-Weekday BX120792 GH_B6-Weekday-021000_BX12_1
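Putting this back into the question's pipeline, a minimal end-to-end sketch might look like the following. The stop_times rows, the single-letter trip IDs, and the time bounds are all hypothetical, and times are assumed to be integer seconds since midnight (GTFS files store them as HH:MM:SS strings, so a real feed would need a conversion step first):

```python
import pandas as pd

# Hypothetical stop_times sample; times are seconds since midnight (assumption).
stopTimes = pd.DataFrame({
    "trip_id": ["A", "A", "B", "B", "C"],
    "arrival_time": [3600, 7200, 30000, 31000, 80000],
    "departure_time": [3660, 7260, 30060, 31060, 80060],
})
trips = pd.DataFrame({
    "route_id": ["BX12", "BX12", "BX12"],
    "trip_id": ["A", "B", "C"],
})
base_mintime, base_maxtime = 21600, 79200  # 06:00 and 22:00, made-up bounds

# Count, per trip, the stops that fall outside the allowed window.
tooEarly = stopTimes["arrival_time"] < base_mintime
tooLate = stopTimes["departure_time"] > base_maxtime
invalidTrips = stopTimes[tooEarly | tooLate].groupby("trip_id").size()

# The trip IDs are the index of invalidTrips, so filter against the index.
validTrips = trips[~trips["trip_id"].isin(invalidTrips.index)]

# len() counts rows; DataFrame.size counts cells (rows x columns).
print(len(invalidTrips), len(trips), len(validTrips))
```

Using len() rather than .size for the sanity checks keeps the three numbers comparable, since .size on a DataFrame is multiplied by the number of columns.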