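I'm using the following query to find duplicate zip values in a dataset.
It does show the Country, City, and Street for any duplicated zip value, but what I really want is for it to include only rows that are duplicates on Country, City, and Street as well, not just on the zip value.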
SELECT
  Country,
  City,
  Street,
  zip
FROM
  project.dataset.tablename
WHERE
  zip > 1
  AND CAST(zip AS string) IN (
    SELECT
      CAST(zip AS string)
    FROM
      project.dataset.tablename
    GROUP BY
      CAST(zip AS string)
    HAVING
      COUNT(CAST(zip AS string)) > 1 )
ORDER BY
  zip DESC
Answer 0 (score: 2)
I think you want:
SELECT t.*
FROM (SELECT t.*,
             COUNT(*) OVER (PARTITION BY zip, country, city, street) AS cnt
      FROM project.dataset.tablename t
     ) t
WHERE cnt > 1
ORDER BY zip;
In any case, window functions usually provide the best solution for this kind of problem.
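For comparison, if one row per duplicated combination is enough (rather than every original row), grouping on the same four columns with HAVING should give that. This is only a sketch: the `addresses` CTE and its sample rows are made up for illustration and stand in for project.dataset.tablename.

```sql
-- Hypothetical sample data standing in for project.dataset.tablename.
WITH addresses AS (
  SELECT 'US' AS Country, 'Springfield' AS City, '1 Main St' AS Street, 12345 AS zip
  UNION ALL SELECT 'US', 'Springfield', '1 Main St', 12345  -- exact duplicate of the first row
  UNION ALL SELECT 'US', 'Springfield', '2 Oak Ave', 12345  -- same zip, different street
  UNION ALL SELECT 'CA', 'Toronto', '5 King St', 54321      -- unique row
)
-- Group on all four columns so rows count as duplicates only when
-- Country, City, Street, and zip all match.
SELECT
  Country,
  City,
  Street,
  zip,
  COUNT(*) AS cnt
FROM addresses
GROUP BY Country, City, Street, zip
HAVING COUNT(*) > 1
ORDER BY zip;
```

On this sample data the query returns a single row for the '1 Main St' address with cnt = 2. The '2 Oak Ave' row shares the zip but is not reported, because its street differs. The window-function version keeps both original '1 Main St' rows instead of collapsing them into one, which is what you want if you need to see or delete the individual duplicates.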