Extract duplicate values where the other fields also match

Time: 2018-10-15 11:31:54

Tags: sql google-bigquery standard-sql

I am using the following query to find duplicate zip values in a dataset.

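It does show the Country, City, and Street for every duplicated zip value, but I really want it to return a row as a duplicate only when the Country, City, and Street match as well, not just the zip value. How can I do that?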

SELECT
  Country,
  City,
  Street,
  zip
FROM
  project.dataset.tablename
WHERE
  zip > 1
  AND CAST(zip AS string) IN (
  SELECT
    CAST(zip AS string)
  FROM
    project.dataset.tablename
  GROUP BY
    CAST(zip AS string)
  HAVING
    COUNT(CAST(zip AS string)) > 1 )
ORDER BY
  zip DESC

1 answer:

Answer 0 (score: 2)

I think you want:

SELECT t.*
FROM (SELECT t.*,
             COUNT(*) OVER (PARTITION BY zip, country, city, street) as cnt
      FROM project.dataset.tablename t
     ) t 
WHERE cnt > 1
ORDER BY zip;

In any case, window functions usually provide the best solution for this kind of problem.
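
To check the behaviour on a small sample, here is a minimal sketch that runs the same window-function pattern against an in-memory SQLite database from Python. It is an illustration under assumptions, not a BigQuery test: the table name `addresses`, the helper `find_full_duplicates`, and the sample rows are all hypothetical, and SQLite needs version 3.25 or later for window functions.

```python
import sqlite3

# Hypothetical sample rows (country, city, street, zip); not from the original post.
ROWS = [
    ("US", "Springfield", "1 Main St", 12345),
    ("US", "Springfield", "1 Main St", 12345),  # full duplicate of the row above
    ("US", "Shelbyville", "9 Oak Ave", 12345),  # same zip only, different address
    ("DE", "Berlin", "Unter den Linden 1", 10117),
]

# Same pattern as the answer: count rows per (zip, country, city, street)
# and keep only the rows whose group has more than one member.
QUERY = """
SELECT country, city, street, zip
FROM (
    SELECT t.*,
           COUNT(*) OVER (PARTITION BY zip, country, city, street) AS cnt
    FROM addresses AS t
) AS counted
WHERE cnt > 1
ORDER BY zip
"""


def find_full_duplicates(rows):
    """Return rows that are duplicated on all four columns."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE addresses "
            "(country TEXT, city TEXT, street TEXT, zip INTEGER)"
        )
        conn.executemany("INSERT INTO addresses VALUES (?, ?, ?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in find_full_duplicates(ROWS):
        print(row)
```

With this sample data, only the two identical Springfield rows should come back. The Shelbyville row shares the zip but not the address, so it is filtered out, which is the difference from the original `IN (...)` query that matched on zip alone.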