Logic issue when counting results within a parameter range

Time: 2015-10-29 22:06:01

Tags: sql postgresql amazon-redshift

I created a large database containing all of our prices and all of our competitors' prices, along with date and location information.

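I want to narrow my database down to only the "true" competitors, based on location and price, because we charge different prices in different locations. For example, I only want a count of the competitors who charge within $1 below or above my price.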

My current code stalls and does not produce a result. I think it is because of the way I implemented the JOIN ON.

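To debug, I split the query apart and got results for the first two tables without any problem, exactly what I wanted. With the third table, "TrueComps", no such luck.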

Once the three tables are joined, it gets complicated. I am new to SQL, so I am interested in learning new solutions. I am sure there is a better one:

WITH 
RentDotComOnly AS
(
  SELECT 
    distinct concat(DATE_PART(mm,archived_apartments.week),clean_zip) AS "monthlyzip",
    COUNT(distinct apt_unique_id) AS "rent_count_clean_zip", 
    -- AVG((low_price+high_price)/2) AS "rent_avg_price", 
    0.85*min(low_price) AS "rent_lower_bound", 
    1.15*max(high_price) AS "rent_upper_bound"
  FROM 
    archived_apartments 
  WHERE 
    source_type in (29,36,316) 
    AND week = '2015-07-06' 
    AND is_house <> 1  
    AND archived_apartments.high_price <> 0 
  GROUP BY monthlyzip, archived_apartments.week, archived_apartments.clean_zip
),
AllRJData AS
(
  SELECT
    distinct concat(DATE_PART(mm,archived_apartments.week),clean_zip) AS "monthlyzip",
    COUNT(distinct apt_unique_id) AS "all_count_clean_zip"
    --, AVG((low_price+high_price)/2) AS "all_avg_price"
  FROM 
    archived_apartments 
  WHERE 
    week = '2015-07-06' 
    AND is_house <> 1  
  GROUP BY monthlyzip, archived_apartments.week, archived_apartments.clean_zip
),
TrueComps AS
(
  SELECT
    distinct concat(DATE_PART(mm,archived_apartments.week),clean_zip) AS "monthlyzip",
    COUNT(distinct apt_unique_id) AS "true_comps"
   FROM
    archived_apartments, RentDotComOnly
   WHERE
    week = '2015-07-06' 
    AND is_house <> 1 
    AND archived_apartments.high_price <> 0 
    AND low_price > 10000
    GROUP BY monthlyzip, archived_apartments.week, archived_apartments.clean_zip
)

SELECT 
  distinct concat(DATE_PART(mm,archived_apartments.week),clean_zip) AS "monthlyzip",
  TrueComps.true_comps AS "TrueComps"
FROM
  archived_apartments, TrueComps

GROUP BY monthlyzip, archived_apartments.week, archived_apartments.clean_zip, truecomps.true_comps
ORDER BY monthlyzip

Original code:

AND (low_price > RentDotComOnly.rent_lower_bound and low_price < RentDotComOnly.rent_upper_bound) or (high_price < RentDotComOnly.rent_upper_bound and high_price > RentDotComOnly.rent_lower_bound)

My full code:

WITH 
RentDotComOnly AS
(
  SELECT 
    distinct concat(DATE_PART(mm,archived_apartments.week),clean_zip) AS "monthlyzip",
    COUNT(distinct apt_unique_id) AS "rent_count_clean_zip", 
    -- AVG((low_price+high_price)/2) AS "rent_avg_price", 
    0.85*min(low_price) AS "rent_lower_bound", 
    1.15*max(high_price) AS "rent_upper_bound"
  FROM 
    archived_apartments 
  WHERE 
    source_type in (29,36,316) 
    AND week = '2015-07-06' 
    AND is_house <> 1  
    AND archived_apartments.high_price <> 0 
  GROUP BY monthlyzip, archived_apartments.week, archived_apartments.clean_zip
),
AllRJData AS
(
  SELECT
    distinct concat(DATE_PART(mm,archived_apartments.week),clean_zip) AS "monthlyzip",
    COUNT(distinct apt_unique_id) AS "all_count_clean_zip"
    --, AVG((low_price+high_price)/2) AS "all_avg_price"
  FROM 
    archived_apartments 
  WHERE 
    week between '2015-07-06' and '2015-10-12' 
    AND is_house <> 1  
  GROUP BY monthlyzip, archived_apartments.week, archived_apartments.clean_zip
),
TrueComps AS
(
  SELECT
    distinct concat(DATE_PART(mm,archived_apartments.week),clean_zip) AS "monthlyzip",
    COUNT(distinct apt_unique_id) AS "true_comps"
   FROM
    archived_apartments, RentDotComOnly
   WHERE
    week between '2015-07-06' and '2015-10-12'
    AND is_house <> 1 
    AND archived_apartments.high_price <> 0 
    AND (low_price > RentDotComOnly.rent_lower_bound and low_price < RentDotComOnly.rent_upper_bound) or (high_price < RentDotComOnly.rent_upper_bound and high_price > RentDotComOnly.rent_lower_bound)    
  GROUP BY monthlyzip, archived_apartments.week, archived_apartments.clean_zip
)

SELECT 
  distinct concat(DATE_PART(mm,archived_apartments.week),clean_zip) AS "monthlyzip",
  RentDotComOnly.rent_count_clean_zip AS "RentOnly",
  AllRJData.all_count_clean_zip AS "Total",
  TrueComps.true_comps AS "TrueComps"
FROM
  archived_apartments
JOIN AllRJData 
ON concat(DATE_PART(mm,archived_apartments.week),archived_apartments.clean_zip) = AllRJData.monthlyzip
JOIN RentDotComOnly
ON concat(DATE_PART(mm,archived_apartments.week),archived_apartments.clean_zip) = RentDotComOnly.monthlyzip
JOIN TrueComps
ON concat(DATE_PART(mm,archived_apartments.week),archived_apartments.clean_zip) = TrueComps.monthlyzip

GROUP BY AllRJData.monthlyzip, archived_apartments.week, archived_apartments.clean_zip, rentdotcomonly.rent_count_clean_zip, allrjdata.all_count_clean_zip, truecomps.true_comps
ORDER BY AllRJData.monthlyzip

2 Answers:

Answer 0 (score: 1)

Try adding a join condition in TrueComps:

FROM
    archived_apartments INNER JOIN RentDotComOnly
        ON concat(DATE_PART(mm,archived_apartments.week),archived_apartments.clean_zip) =
           RentDotComOnly.monthlyzip
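
To illustrate why the missing join condition matters, here is a minimal sketch that uses Python's built-in sqlite3 module instead of Redshift. The tables `apartments` and `bounds` and their rows are made up for this example; they only stand in for `archived_apartments` and `RentDotComOnly`. A comma join with no condition pairs every row of one table with every row of the other, while a join on the shared key pairs each apartment only with the bounds for its own zip.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical miniature tables, not the real schema from the question.
conn.executescript("""
    CREATE TABLE apartments (apt_id INTEGER, monthlyzip TEXT, low_price REAL);
    CREATE TABLE bounds (monthlyzip TEXT, lower_bound REAL, upper_bound REAL);
    INSERT INTO apartments VALUES (1, '710001', 900);
    INSERT INTO apartments VALUES (2, '710001', 1500);
    INSERT INTO apartments VALUES (3, '710002', 2000);
    INSERT INTO bounds VALUES ('710001', 800, 1200);
    INSERT INTO bounds VALUES ('710002', 1800, 2500);
""")

# Comma join without a condition: every apartment meets every bounds row.
cross_rows = conn.execute(
    "SELECT COUNT(*) FROM apartments, bounds"
).fetchone()[0]

# Join on the shared key: each apartment meets only its own zip's bounds.
joined_rows = conn.execute(
    "SELECT COUNT(*) FROM apartments a "
    "INNER JOIN bounds b ON a.monthlyzip = b.monthlyzip"
).fetchone()[0]

print("rows without join condition:", cross_rows)
print("rows with join condition:", joined_rows)
```

The unconditioned row count grows as the product of the two table sizes, so on a large table it may be enough on its own to make a query appear to hang.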

Answer 1 (score: 0)

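I think you may have the parentheses wrong in the last part of the WHERE clause. I don't know exactly what logic you are trying to achieve, but I think the fix is: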

AND (
            low_price  > RentDotComOnly.rent_lower_bound
        and low_price  < RentDotComOnly.rent_upper_bound
    or      high_price < RentDotComOnly.rent_upper_bound
        and high_price > RentDotComOnly.rent_lower_bound
)

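As you have written it, the OR'd condition is not grouped with the other conditions and stands on its own (AND binds more tightly than OR), which could also be causing the slowdown you are seeing.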
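
The precedence effect can be reproduced on a tiny table. The following is only a sketch, run through Python's sqlite3 module rather than Redshift; the `listings` table, its rows, and the fixed bounds 900 and 1200 are invented for illustration. Row 2 is a house that should be filtered out by `is_house <> 1`, but in the unparenthesized version it is let back in by the trailing OR.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical data: id 1 is an in-range apartment, id 2 an in-range house,
# id 3 an out-of-range apartment.
conn.executescript("""
    CREATE TABLE listings (id INTEGER, is_house INTEGER, low_price REAL, high_price REAL);
    INSERT INTO listings VALUES (1, 0, 1000, 1100);
    INSERT INTO listings VALUES (2, 1, 1000, 1100);
    INSERT INTO listings VALUES (3, 0, 5000, 6000);
""")

# Parsed as (is_house <> 1 AND low in range) OR (high in range):
# the is_house filter does not apply to the second branch.
loose = [row[0] for row in conn.execute("""
    SELECT id FROM listings
    WHERE is_house <> 1
      AND (low_price > 900 AND low_price < 1200)
       OR (high_price < 1200 AND high_price > 900)
    ORDER BY id
""")]

# Outer parentheses keep both price branches under the is_house filter.
strict = [row[0] for row in conn.execute("""
    SELECT id FROM listings
    WHERE is_house <> 1
      AND (   (low_price > 900 AND low_price < 1200)
           OR (high_price < 1200 AND high_price > 900))
    ORDER BY id
""")]

print("without outer parentheses:", loose)
print("with outer parentheses:", strict)
```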

Another guess is that you are looking for an overlap between the price ranges. Is this what you really want?

AND (
            low_price  <= RentDotComOnly.rent_upper_bound
        and high_price >= RentDotComOnly.rent_lower_bound
)
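
The difference between the two conditions can be checked with a small sketch in plain Python. The function names and the sample prices are hypothetical. The first function mirrors the endpoint test from the question, the second mirrors the overlap test above; they disagree when a listing's price range completely spans the bounds.

```python
def endpoint_inside(low, high, lower_bound, upper_bound):
    """Question-style test: at least one endpoint lies strictly inside the bounds."""
    return (lower_bound < low < upper_bound) or (lower_bound < high < upper_bound)


def ranges_overlap(low, high, lower_bound, upper_bound):
    """Overlap test: the closed ranges [low, high] and [lower_bound, upper_bound] share a value."""
    return low <= upper_bound and high >= lower_bound


# Hypothetical bounds of 900..1200 and three listing price ranges.
cases = {
    "partial overlap": (1000, 1500),
    "spans the bounds": (800, 1300),
    "disjoint": (1300, 1500),
}

for label, (low, high) in cases.items():
    print(label, endpoint_inside(low, high, 900, 1200), ranges_overlap(low, high, 900, 1200))
```

A listing priced from 800 to 1300 has neither endpoint inside 900..1200, so the endpoint test rejects it even though the two ranges clearly overlap.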