Question

I have 3 tables: users, trips and tripDetails. When a user creates a trip, a row is created in the trips table with the following fields:

id(INT), user_id(INT) & dateCreated(DATE)

and 4 rows are created in the tripDetails table:

id(INT), trip_id(INT), field(VARCHAR) & value(VARCHAR)

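where field is one of 'smartbox.destination', 'smartbox.dateFrom', 'smartbox.dateTo' and 'smartbox.numberOfPeople', and the value column is a varchar. However, let's say the user changes the destination and saves this change: a new record is created in the trips table, and a single record is created in the tripDetails table (the updated destination).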

Now I would like to write a select that gives me a snapshot of a user's trips on a given day, with the column headings:

user_id, trip_id, destination, dateFrom, dateTo, numberOfPeople, givenDay(DATE)

so that if only one field was changed on a given day, all the other columns show their most recent values relative to that day.

I have set up a sqlfiddle here.

Answer 1

First of all, let me say this: you have a seriously flawed data model, with its "overriding key/value pairs" way of handling data.

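Now to your problem. Provided that

the tripDetails.value column is declared as not null,
your SQL client prompts you for the value of givenDate,
and you enter it in (exactly) the format yyyy-mm-dd,

your query might look like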

with pivot$ as (
    select
        U.id as user_id, T.id as trip_id, max(T.dateCreated) as trip_date,
        max(decode(TD.field, 'smartbox.destination', TD.value)) as trip_destination,
        max(decode(TD.field, 'smartbox.dateFrom', TD.value)) as trip_date_from,
        max(decode(TD.field, 'smartbox.dateTo', TD.value)) as trip_date_to,
        max(decode(TD.field, 'smartbox.numberOfPeople', TD.value)) as trip_no_of_people
    from users U
        join trips T
            on T.user_id = U.id
        join tripDetails TD
            on TD.trip_id = T.id
            and TD.field in ('smartbox.destination', 'smartbox.dateFrom', 'smartbox.dateTo', 'smartbox.numberOfPeople')
    where T.dateCreated <= date'&givenDate'
    group by U.id, T.id
),
resolve_versioning$ as (
    select user_id, trip_id, trip_date,
        first_value(trip_destination) ignore nulls over (partition by user_id order by trip_date desc rows between current row and unbounded following) as trip_destination,
        first_value(trip_date_from) ignore nulls over (partition by user_id order by trip_date desc rows between current row and unbounded following) as trip_date_from,
        first_value(trip_date_to) ignore nulls over (partition by user_id order by trip_date desc rows between current row and unbounded following) as trip_date_to,
        first_value(trip_no_of_people) ignore nulls over (partition by user_id order by trip_date desc rows between current row and unbounded following) as trip_no_of_people,
        row_number() over (partition by user_id order by trip_date desc) as relevance$
    from pivot$
)
select user_id, trip_id,
    trip_destination, trip_date_from, trip_date_to, trip_no_of_people,
    date'&givenDate' as given_date
from resolve_versioning$
where relevance$ <= 1
;

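This works in three steps:

The pivot$ subquery denormalizes your key/value pairs into wider rows, with trip_id as the logical primary key of the dataset, effectively leaving a column NULL wherever a trip_id has no key/value pair for it. (By the way, this is where the non-nullability of the tripDetails.value column is crucial to the success of the query: a NULL in the pivoted row must always mean "not changed in this trip".)
The resolve_versioning$ subquery uses the first_value() analytic function. Working on all trips of a user (partition by user_id), it finds, for each individual trip detail, the first (first_value) non-NULL (ignore nulls) value of that detail, going from the "youngest" trip date back to the older ones (order by trip_date desc). Or, if you look at it the other way around, it finds the last non-NULL value of a trip detail in the order of trip dates.
The rows between current row and unbounded following clause is a bit of "magic" needed to get the window right for this particular analytic order by. By default, a window with an order by ends at the current row, so on the youngest row first_value() would not see any of the older trips; the explicit frame makes the window run from the current row to the oldest trip.
The row_number() over (partition by user_id order by trip_date desc) expression simply numbers the result rows from 1 upwards, with 1 assigned to the "youngest" row in trip date order. The outermost select then filters the result down to that youngest row only (relevance$ <= 1).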
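
The three steps above can also be traced on a tiny dataset outside Oracle. What follows is only a minimal sketch under stated assumptions, not the query from this answer: it uses Python's standard sqlite3 module, replaces decode() with a portable case expression, handles one user at a time, and does the "first non-NULL value" step in plain Python instead of first_value() ... ignore nulls, since that clause is not portable across databases. The sample rows and the snapshot() function name are made up for illustration.

```python
import sqlite3

# Hypothetical sample data: trip 1 is the initial save (all four fields),
# trip 2 is a later save in which only the destination was changed.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table trips (id integer primary key, user_id integer, dateCreated text);
    create table tripDetails (id integer primary key, trip_id integer,
                              field text, value text not null);
    insert into trips values (1, 10, '2014-01-01'), (2, 10, '2014-01-05');
    insert into tripDetails (trip_id, field, value) values
        (1, 'smartbox.destination', 'Paris'),
        (1, 'smartbox.dateFrom', '2014-03-01'),
        (1, 'smartbox.dateTo', '2014-03-08'),
        (1, 'smartbox.numberOfPeople', '2'),
        (2, 'smartbox.destination', 'Rome');
""")

FIELDS = ["destination", "dateFrom", "dateTo", "numberOfPeople"]


def snapshot(conn, user_id, given_day):
    """Return (user_id, trip_id, <four details>, given_day), or None if no trip exists yet."""
    # Step 1 (pivot): one row per trip, NULL where the trip has no such key.
    pivot_cols = ", ".join(
        f"max(case when TD.field = 'smartbox.{f}' then TD.value end)" for f in FIELDS
    )
    rows = conn.execute(
        f"""select T.id, T.dateCreated, {pivot_cols}
            from trips T
                join tripDetails TD on TD.trip_id = T.id
            where T.user_id = ? and T.dateCreated <= ?
            group by T.id, T.dateCreated
            order by T.dateCreated desc, T.id desc""",
        (user_id, given_day),
    ).fetchall()
    if not rows:
        return None
    # Step 3 equivalent: the youngest trip is the first row of the ordering.
    latest_trip_id = rows[0][0]
    # Step 2 equivalent: for each detail, take the first non-NULL value while
    # walking from the youngest trip back to the oldest one.
    values = [
        next((r[i] for r in rows if r[i] is not None), None)
        for i in range(2, 2 + len(FIELDS))
    ]
    return (user_id, latest_trip_id, *values, given_day)


print(snapshot(conn, 10, "2014-01-03"))
print(snapshot(conn, 10, "2014-01-05"))
```

With this data, the first call only sees trip 1, while the second reports trip 2 with the new destination and carries the remaining three details over from trip 1. The dates are stored as ISO yyyy-mm-dd text here, so comparing them as strings gives the same ordering as comparing dates.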

Enjoy!

Complex SQL query with pivot
