PostgreSQL - for each ID

Time: 2017-02-09 06:06:03

Tags: sql postgresql greatest-n-per-group

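Situation

I am working on a travel engine website, and I am writing a complex query that matches visitors' search queries with bookings based on IP address, destination, and date, so that I can calculate conversion rates later.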

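Problem

I need several conversion rates broken down by a parameter (in this case utm_source, which I extract from the RequestUrl stored in the search table). The problem is that some users run multiple searches from different locations. Sometimes we get utm_source in the request and sometimes we don't, and of course each booking must be matched only once. See the screenshot of the query results below for a better understanding: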

[Screenshot of the query results; image not available]

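Notice that rows 3 and 4 have the same booking ID and so on, but the "Value" column differs. I need to select only one of them, not both. Basically, if there is more than one row for a booking, I need to pick the one that is not 'N/A'.

My query: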

SELECT DISTINCT "B"."Id" AS "BookingId", "PQ"."IPAddress", "PQ"."To", "PQ"."SearchDate", "PQ"."Value"
FROM
(
    SELECT DISTINCT "IPAddress", "To", "CreatedAt"::date AS "SearchDate", COALESCE(SUBSTRING("RequestUrl", 'utm_source=([^&]*)'), 'N/A') AS "Value"
    FROM dbo."PackageQueries"
    WHERE "SiteId" = '<The ID>'
    AND "CreatedAt" >= '<Start Date>'
    AND "CreatedAt" < '<End Date>'
) AS "PQ"
INNER JOIN dbo."Bookings" AS "B"
    ON "PQ"."IPAddress" = "B"."IPAddress"
    AND "B"."To" = "PQ"."To"
    AND "B"."BookingDate"::date = "PQ"."SearchDate"
WHERE "B"."SiteId" = '<The ID>'
AND "B"."BookingStatus" = 2
AND "B"."BookingDate" >= '<Start Date>'
AND "B"."BookingDate" < '<End Date>'
ORDER BY "B"."Id", "PQ"."IPAddress", "PQ"."To";

1 Answer:

Answer 0 (score: 0)

I found a solution, based on what I found here: Return rows that are max of one column in Postgresql and here: Postgres CASE in ORDER BY using an alias

My solution is as follows:

SELECT "BookingId", "IPAddress", "To", "SearchDate", "Value"
FROM
(
    SELECT DISTINCT
        "B"."Id" AS "BookingId",
        "PQ"."IPAddress",
        "PQ"."To",
        "PQ"."SearchDate",
        "PQ"."Value",
        RANK() OVER
        (
            PARTITION BY "B"."Id"
            ORDER BY
            CASE
                WHEN "PQ"."Value" = 'N/A' THEN 1
                ELSE 0
            END
        ) AS "RowNumber"
    FROM
    (
        SELECT DISTINCT "IPAddress", "To", "CreatedAt"::date AS "SearchDate", COALESCE(SUBSTRING("RequestUrl", 'utm_source=([^&]*)'), 'N/A') AS "Value"
        FROM dbo."PackageQueries"
        WHERE "SiteId" = '<Site ID>'
        AND "CreatedAt" >= '<Start Date>'
        AND "CreatedAt" < '<End Date>'
    ) AS "PQ"
    INNER JOIN dbo."Bookings" AS "B"
        ON "PQ"."IPAddress" = "B"."IPAddress"
        AND "B"."To" = "PQ"."To"
        AND "B"."BookingDate"::date = "PQ"."SearchDate"
    WHERE "B"."SiteId" = '<Site ID>'
    AND "B"."BookingStatus" = 2
    AND "B"."BookingDate" >= '<Start Date>'
    AND "B"."BookingDate" < '<End Date>'
) T
WHERE "RowNumber" = 1
ORDER BY "BookingId", "IPAddress", "To";

It is a bit long-winded, but it works well. I hope it helps someone.

Edit
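The ranking step can be looked at in isolation on a few hand-made rows. The following is only a minimal sketch, not the production query: the inline "matches" rows and the value 'google' are invented for illustration, and they stand in for the joined result of "PackageQueries" and "Bookings".

```sql
-- Toy data (hypothetical): booking 1 matched two searches, booking 2 matched one.
WITH matches ("BookingId", "Value") AS (
    VALUES
        (1, 'N/A'),
        (1, 'google'),
        (2, 'N/A')
)
SELECT "BookingId", "Value"
FROM (
    SELECT
        "BookingId",
        "Value",
        -- Within each booking, 'N/A' gets sort key 1 and anything else gets 0,
        -- so a real utm_source value is ranked first whenever one exists.
        RANK() OVER (
            PARTITION BY "BookingId"
            ORDER BY CASE WHEN "Value" = 'N/A' THEN 1 ELSE 0 END
        ) AS "RowNumber"
    FROM matches
) AS ranked
WHERE "RowNumber" = 1
ORDER BY "BookingId";
```

On this toy data, booking 1 keeps 'google' and booking 2 keeps 'N/A', because 'N/A' is only ranked first when it is the sole value for a booking.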

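That was not the end of the story: in some cases I still got more than one value per booking. RANK() assigns the same rank to tied rows, so two different values that are both not 'N/A' both end up with rank 1. The answer was to modify the CASE statement so that it generates a unique number for each text value. The solution can be found here: PostgreSQL - Assign integer value to string in case statement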
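A different way to break such ties, shown here only as a sketch and not as the approach from the linked question, is to replace RANK() with ROW_NUMBER() and add a deterministic tiebreaker to the ORDER BY. The toy rows and the values 'google' and 'newsletter' are invented, and using alphabetical order of "Value" as the tiebreaker is an assumption; any column that gives a stable order would do.

```sql
-- Toy data (hypothetical): one booking matched three searches,
-- two of which carry different utm_source values.
WITH matches ("BookingId", "Value") AS (
    VALUES
        (1, 'N/A'),
        (1, 'google'),
        (1, 'newsletter')
)
SELECT "BookingId", "Value"
FROM (
    SELECT
        "BookingId",
        "Value",
        -- ROW_NUMBER() never repeats a number inside a partition,
        -- so exactly one row per booking gets "RowNumber" = 1.
        ROW_NUMBER() OVER (
            PARTITION BY "BookingId"
            ORDER BY
                CASE WHEN "Value" = 'N/A' THEN 1 ELSE 0 END,  -- 'N/A' last
                "Value"                                       -- assumed tiebreaker
        ) AS "RowNumber"
    FROM matches
) AS ranked
WHERE "RowNumber" = 1;
```

With RANK() and only the CASE expression in the ORDER BY, both 'google' and 'newsletter' would come back for booking 1. With ROW_NUMBER() and the extra sort key, a single row per booking is returned. Which value should win when a visitor arrived through more than one source is a business decision that this sketch does not settle.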