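I'm trying to create a query that works out who owned a particular vehicle, and therefore where that owner was, at a given point in time. Vehicle sightings are stored in the vehicle_sightings table. The tricky part is that the vehicle_vrn and vehicle_ownership tables are historical, so I need to find the VRN and the owner of the vehicle as they were at the time of the sighting, based on the seenDate field in the vehicle_sightings table.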
SELECT
sighting_id
FROM
vehicle_sightings
INNER JOIN
vehicle_vrn ON vehicle_sightings.plate = vehicle_vrn.vrnno
INNER JOIN
vehicle_ownership ON vehicle_vrn.fk_sysno = vehicle_ownership.fk_sysno
WHERE
vehicle_sightings.seenDate >= vehicle_ownership.ownership_start_date
AND (vehicle_sightings.seenDate <= vehicle_ownership.ownership_end_date
OR vehicle_ownership.ownership_end_date IS NULL
OR vehicle_ownership.ownership_end_date = '0001-01-01 00:00:00')
GROUP BY sighting_id
HAVING seenDate >= MAX(ownership_start_date);
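I have tried many variations of the query above, but none of them returns the desired result except the one pasted here. What worries me is that it may not really work the way I hope, because I don't have much experience with GROUP BY.
What I want is for the record whose ownership_start_date is closest to seenDate (as in the screenshot below) to be used, and for the other records to be ignored. This would not matter if end_date were specified; the situation only arises when no end_date is specified and there are multiple history entries.
Am I on the right track? Does this query make sense? And will it also take the vehicle_vrn history into account, since the same VRN can have multiple entries with different vrn_start_dates?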
Answer 0 (score: 1)
You're almost there. However, your HAVING clause won't have any effect: the latest ownership_start_date in each group must already be on or before seenDate, because your WHERE clause explicitly requires that of every constituent record.
What you're after is the group-wise maximum, which can be obtained by joining your grouped results back to the underlying table. For example:
SELECT * FROM vehicle_ownership JOIN (
  SELECT
    vehicle_sightings.*,
    vehicle_ownership.fk_sysno,
    MAX(vehicle_ownership.ownership_start_date) AS ownership_start_date
  FROM
    vehicle_sightings
  INNER JOIN
    vehicle_vrn ON vehicle_sightings.plate = vehicle_vrn.vrnno
  INNER JOIN
    vehicle_ownership ON vehicle_vrn.fk_sysno = vehicle_ownership.fk_sysno
  WHERE
    vehicle_sightings.seenDate >= vehicle_ownership.ownership_start_date
    AND (vehicle_sightings.seenDate <= vehicle_ownership.ownership_end_date
      OR vehicle_ownership.ownership_end_date IS NULL
      OR vehicle_ownership.ownership_end_date = '0001-01-01 00:00:00')
  GROUP BY sighting_id
) t USING (fk_sysno, ownership_start_date)
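To see the group-wise maximum in action, here is a minimal sketch that runs the same join-back pattern against toy data in an in-memory SQLite database via Python's sqlite3 module. The sample rows and the owner_name column are hypothetical, dates are stored as ISO-8601 text so that string comparison orders them correctly, and the subquery groups by fk_sysno as well as sighting_id so that it does not rely on MySQL's permissive GROUP BY handling. Treat it as an illustration of the technique rather than a drop-in replacement for your schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE vehicle_sightings (sighting_id INTEGER PRIMARY KEY, plate TEXT, seenDate TEXT);
CREATE TABLE vehicle_vrn (fk_sysno INTEGER, vrnno TEXT);
CREATE TABLE vehicle_ownership (
    fk_sysno INTEGER,
    owner_name TEXT,              -- hypothetical column, for illustration only
    ownership_start_date TEXT,
    ownership_end_date TEXT);

INSERT INTO vehicle_sightings VALUES (1, 'ABC123', '2012-06-15 10:00:00');
INSERT INTO vehicle_vrn VALUES (7, 'ABC123');
-- Three open-ended ownership rows for the same vehicle.
INSERT INTO vehicle_ownership VALUES
    (7, 'Alice', '2010-01-01 00:00:00', NULL),
    (7, 'Bob',   '2012-01-01 00:00:00', NULL),
    (7, 'Carol', '2013-01-01 00:00:00', NULL);
""")

query = """
SELECT t.sighting_id, o.owner_name, o.ownership_start_date
FROM vehicle_ownership AS o
JOIN (
    -- Per sighting and vehicle, the latest start date not after the sighting.
    SELECT s.sighting_id, v.fk_sysno,
           MAX(o2.ownership_start_date) AS ownership_start_date
    FROM vehicle_sightings AS s
    JOIN vehicle_vrn AS v ON s.plate = v.vrnno
    JOIN vehicle_ownership AS o2 ON v.fk_sysno = o2.fk_sysno
    WHERE s.seenDate >= o2.ownership_start_date
      AND (s.seenDate <= o2.ownership_end_date
           OR o2.ownership_end_date IS NULL
           OR o2.ownership_end_date = '0001-01-01 00:00:00')
    GROUP BY s.sighting_id, v.fk_sysno
) AS t
  ON  o.fk_sysno = t.fk_sysno
  AND o.ownership_start_date = t.ownership_start_date
ORDER BY t.sighting_id
"""

# Join the grouped maximum back to the base table to recover the full owner row.
rows = conn.execute(query).fetchall()
print(rows)
```

With this data, Carol's row is excluded by the WHERE clause because it starts after the sighting, and of the two remaining open-ended rows only the one with the later start date (Bob's) survives the join. If vehicle_vrn carries its own validity dates, a similar date-range condition on that table would presumably be needed as well; that part is not covered here.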