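I am trying to return a table that joins data from an ads table and a website traffic table, both of which contain hourly data. However, a timestamp that exists in the ads table for a given date may not exist in the website table. For example, the timestamp "2017-09-27 20:00:00+00" exists in the website traffic table but not in the ads table, and the reverse happens for other timestamps.
I am currently using a query that selects the ads table's timestamp and uses a left join. A full outer join does not seem to solve this, most likely because the ads timestamp is selected rather than the website traffic timestamp.
Is there a way in PostgreSQL to return the timestamps from both tables in a single column?
Many thanks.
The query I am currently using is below: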
SELECT
    ads.phase AS "phase",
    ads.datetime_utc AS "datetime",
    lower(array_to_string((regexp_split_to_array(ads.placement, '_'))[1:9], '_')) AS "delim_dims",
    a.name AS " name",
    ads.device AS "device",
    sum(ads.impressions) AS "impressions",
    sum(ads.clicks) AS "clicks",
    sum(ads.spend) AS "spend",
    web.sessions AS "sessions",
    web.bounces AS "bounces"
FROM
    ads_data AS ads
    INNER JOIN
        lookup.names_lookup AS a ON
            ads.lookup_code = a.lookup_code
    LEFT JOIN -- tested with FULL OUTER JOIN, returns same results
        web.website_traffic AS web ON
            ads.datetime_utc = web.datetime_est
            AND
            a.lookup_code = web.lookup_code
            AND
            ads.device = web.device
GROUP BY
    ads.phase,
    datetime,
    delim_dims,
    a.audience_name,
    web.sessions,
    web.bounces,
    device
HAVING
    sum(ads.spend) > 0
Answer 0 (score: 0)
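I'm a little confused by your wording, because your question title asks for "datetimes that exist in both tables", which suggests you only want rows from ads_data and web.website_traffic that have matching datetimes. But you also want to use a LEFT JOIN or FULL OUTER JOIN, which makes me think you want rows that have a datetime in either table. The way I'm interpreting this is that you want one column holding the datetime from either table: if a row has a matching timestamp in both tables, great; if the timestamp exists in only one table or the other, return that value.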
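It looks like your problem is that you INNER JOIN lookup.names_lookup (a) to ads_data, and then one of your join conditions to web.website_traffic is a.lookup_code = web.lookup_code. Everything on the left-hand side of that join comes from ads_data, and you select ads.datetime_utc and filter on sum(ads.spend) > 0, so the query effectively behaves like an INNER JOIN from the ads side. A LEFT JOIN never returns rows that exist only in web.website_traffic, and with a FULL OUTER JOIN those rows come back with NULL in every ads and a column and are then removed by the HAVING clause. Either way, you only get results for data that is in ads_data, and never the case where a row is in web.website_traffic but not in ads_data.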
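A toy reproduction can make this diagnosis concrete. The sketch below uses made-up inline rows (the `ads` and `web` CTEs and all of their values are assumptions, not the real tables) and keeps only the timestamp, spend, and sessions columns. It suggests that even with a FULL OUTER JOIN, the hour that exists only on the web side comes back with NULL in every `ads` column, so a selected `ads.datetime_utc` is NULL for that row and a filter on `sum(ads.spend) > 0` would drop it.

```sql
-- Hypothetical sample data: only the 21:00 hour exists in both tables.
WITH ads (datetime_utc, spend) AS (
    VALUES (TIMESTAMPTZ '2017-09-27 21:00:00+00', 10.0),
           (TIMESTAMPTZ '2017-09-27 22:00:00+00', 5.0)
),
web (datetime_est, sessions) AS (
    VALUES (TIMESTAMPTZ '2017-09-27 20:00:00+00', 7),
           (TIMESTAMPTZ '2017-09-27 21:00:00+00', 3)
)
SELECT
    ads.datetime_utc  AS ads_datetime,  -- NULL for the web-only 20:00 row
    web.datetime_est  AS web_datetime,  -- NULL for the ads-only 22:00 row
    sum(ads.spend)    AS spend,         -- NULL for the web-only 20:00 row
    min(web.sessions) AS sessions
FROM
    ads FULL OUTER JOIN web ON
        ads.datetime_utc = web.datetime_est
GROUP BY
    ads.datetime_utc,
    web.datetime_est
-- Uncommenting the next line removes the web-only row, because NULL > 0 is not true:
-- HAVING sum(ads.spend) > 0
ORDER BY
    COALESCE(ads.datetime_utc, web.datetime_est);
```

Run as written, this should return three rows (20:00, 21:00, 22:00); with the HAVING line enabled, only the two hours that have ads data remain.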
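Instead, I would start with a subquery (a CTE) that is a FULL OUTER JOIN of ads_data with web.website_traffic, to bring together all of the unmatched rows plus all of the matching rows, and then INNER JOIN that result with lookup.names_lookup.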
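As a minimal sketch of that shape on toy data: the three inline CTEs below stand in for ads_data, web.website_traffic, and lookup.names_lookup, and every value in them (including the lookup code `x1` and the name `example name`) is hypothetical. The point is only to show COALESCE folding the two timestamp columns into one after the FULL OUTER JOIN, before the lookup is joined in.

```sql
-- Hypothetical stand-ins for the three real tables.
WITH ads (datetime_utc, lookup_code, spend) AS (
    VALUES (TIMESTAMPTZ '2017-09-27 21:00:00+00', 'x1', 10.0),
           (TIMESTAMPTZ '2017-09-27 22:00:00+00', 'x1', 5.0)
),
web (datetime_est, lookup_code, sessions) AS (
    VALUES (TIMESTAMPTZ '2017-09-27 20:00:00+00', 'x1', 7),
           (TIMESTAMPTZ '2017-09-27 21:00:00+00', 'x1', 3)
),
names_lookup (lookup_code, name) AS (
    VALUES ('x1', 'example name')
),
alldata AS (
    SELECT
        -- take the join keys from whichever side actually has the row
        COALESCE(ads.datetime_utc, web.datetime_est) AS datetime,
        COALESCE(ads.lookup_code, web.lookup_code)   AS lookup_code,
        ads.spend,
        web.sessions
    FROM
        ads FULL OUTER JOIN web ON
            ads.datetime_utc = web.datetime_est AND
            ads.lookup_code = web.lookup_code
)
SELECT
    alldata.datetime,
    a.name,
    alldata.spend,      -- NULL where the hour exists only in web
    alldata.sessions    -- NULL where the hour exists only in ads
FROM
    alldata INNER JOIN names_lookup AS a ON
        alldata.lookup_code = a.lookup_code
ORDER BY
    alldata.datetime;
```

Because the lookup join uses the coalesced `lookup_code`, the web-only 20:00 row survives the INNER JOIN, and all three hours should appear in a single `datetime` column.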
I noticed a few issues:
Here is an attempt at the SQL (ignoring the potential UTC vs. EST issue):
WITH alldata AS (
    -- keep every row from both tables, matched or not
    SELECT
        ads.phase,
        COALESCE(ads.datetime_utc, web.datetime_est) AS "datetime",
        COALESCE(ads.lookup_code, web.lookup_code) AS "lookup_code",
        COALESCE(ads.device, web.device) AS "device",
        ads.placement,
        ads.impressions,
        ads.clicks,
        ads.spend,
        web.sessions,
        web.bounces
    FROM
        ads_data AS ads FULL OUTER JOIN web.website_traffic AS web ON
            ads.datetime_utc = web.datetime_est AND
            ads.lookup_code = web.lookup_code AND
            ads.device = web.device
)
SELECT
    alldata.phase AS "phase",
    alldata.datetime AS "datetime",
    lower(array_to_string((regexp_split_to_array(alldata.placement, '_'))[1:9], '_')) AS "delim_dims",
    a.name AS "name",
    alldata.device AS "device",
    sum(alldata.impressions) AS "impressions",
    sum(alldata.clicks) AS "clicks",
    sum(alldata.spend) AS "spend",
    min(alldata.sessions) AS "sessions",
    min(alldata.bounces) AS "bounces"
FROM
    alldata INNER JOIN lookup.names_lookup AS a ON
        alldata.lookup_code = a.lookup_code
GROUP BY
    alldata.phase,
    alldata.datetime,
    lower(array_to_string((regexp_split_to_array(alldata.placement, '_'))[1:9], '_')),
    a.name, -- was a.audience_name; the grouped column has to be the one selected above
    alldata.device
HAVING
    -- the ads alias is not visible outside the CTE, so filter on alldata
    sum(alldata.spend) > 0
    -- rows that exist only in web.website_traffic have NULL spend and fail this
    -- filter; to keep them, append: OR sum(alldata.spend) IS NULL
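The query above sets aside the UTC versus EST question. If both columns are `timestamp with time zone` (the `+00` in the example value hints at this, but it is an assumption), `=` already compares absolute instants and nothing more is needed. If instead `web.datetime_est` were a `timestamp without time zone` holding US Eastern wall-clock time, it would need converting before the join. A minimal sketch of that conversion, using a literal in place of the real column and assuming the zone name `America/New_York`:

```sql
-- The inner AT TIME ZONE reads the naive value as New York local time and
-- yields a timestamptz; the outer one renders that instant as a naive UTC timestamp.
SELECT
    (TIMESTAMP '2017-09-27 16:00:00' AT TIME ZONE 'America/New_York')
        AT TIME ZONE 'UTC' AS datetime_utc;
-- 2017-09-27 20:00:00 (daylight time is in effect on that date, so the offset is 4 hours)
```

If the data is stored at a fixed UTC-5 offset all year round, `'EST'` in place of `'America/New_York'` would be the matching zone name. The same expression could then replace `web.datetime_est` in both the join condition and the COALESCE.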