I have a table like the one below, where the counts for a pair relationship often appear in reversed order.
country1 country2 count
CHN KOR 65
TWN KOR 32
KOR CHN 43
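I have both CHN - KOR and KOR - CHN. I have already established that these are distinct counts, and that the two rows are just two ways of describing the same relationship, so I want to sum the counts for each pair. The final result should be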
country1 country2 count
CHN KOR 108
TWN KOR 32
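I am using BigQuery. Does anyone know a way to consolidate reversed pairs in SQL? Note: these are not duplicates, so this is not a de-duplication problem but a matter of combining reversed pairs.
Answer 0 (score: 3)
Another option, showing off the power and coolness of BigQuery Standard SQL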
#standardSQL
WITH pairs AS (
  SELECT
    -- Sort the two codes so that CHN/KOR and KOR/CHN produce the same key
    (SELECT STRING_AGG(country ORDER BY country)
     FROM UNNEST(ARRAY[country1, country2]) AS country
    ) AS countries,
    SUM(COUNT) AS COUNT
  FROM yourTable
  GROUP BY countries
)
-- Split the comma-separated key back into two columns
SELECT
  REGEXP_EXTRACT(countries, r'(\w+),') AS country1,
  REGEXP_EXTRACT(countries, r',(\w+)') AS country2,
  COUNT
FROM pairs
This version may be more optimal when you have more than two "mis-ordered" fields.
You can test it briefly with the dummy data below
#standardSQL
WITH yourTable AS (
  SELECT 'CHN' AS country1, 'KOR' AS country2, 65 AS COUNT UNION ALL
  SELECT 'TWN', 'KOR', 32 UNION ALL
  SELECT 'KOR', 'CHN', 43
)
-- continue with ", pairs AS (...)" and the final SELECT from the query above
Below is a quick example for the case where more than two fields are shuffled
#standardSQL
WITH yourTable AS (
  SELECT 'CHN' AS country1, 'KOR' AS country2, 'US' AS country3, 65 AS COUNT UNION ALL
  SELECT 'TWN', 'KOR', 'GB', 32 UNION ALL
  SELECT 'KOR', 'US', 'CHN', 43
),
pairs AS (
  SELECT
    (SELECT STRING_AGG(country ORDER BY country)
     FROM UNNEST(ARRAY[country1, country2, country3]) AS country
    ) AS countries,
    SUM(COUNT) AS COUNT
  FROM yourTable
  GROUP BY countries
)
SELECT
  REGEXP_EXTRACT(countries, r'(\w+),\w+,\w+') AS country1,
  REGEXP_EXTRACT(countries, r'\w+,(\w+),\w+') AS country2,
  REGEXP_EXTRACT(countries, r'\w+,\w+,(\w+)') AS country3,
  COUNT
FROM pairs
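Of course, this can be optimized further, but the main focus here is the regrouping logic, which needs no multiple comparisons and the like
Added
Thanks to @GordonLinoff for insisting on the option below! I think you are right: using ARRAY_AGG here is more elegant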
#standardSQL
WITH yourTable AS (
  SELECT 'CHN' AS country1, 'KOR' AS country2, 'US' AS country3, 65 AS count UNION ALL
  SELECT 'TWN', 'KOR', 'GB', 32 UNION ALL
  SELECT 'KOR', 'US', 'CHN', 43
),
pairs AS (
  SELECT
    (SELECT ARRAY_AGG(country ORDER BY country)
     FROM UNNEST(ARRAY[country1, country2, country3]) AS country
    ) AS countries,
    count
  FROM yourTable
)
SELECT
  countries[OFFSET(0)] AS country1,
  countries[OFFSET(1)] AS country2,
  countries[OFFSET(2)] AS country3,
  SUM(count) AS count
FROM pairs
GROUP BY 1, 2, 3
Answer 1 (score: 1)
Here is one method:
select country1, country2, sum(count)
from ((select country1, country2, count
       from t
       where country1 <= country2
      ) union all
      (select country2, country1, count
       from t
       where country1 > country2
      )
     ) cc
group by country1, country2;
This works in both the legacy and standard interfaces. For standard, BigQuery also supports least() and greatest() on strings.
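The least()/greatest() variant this answer refers to does not appear in the text. The following is only a minimal sketch of what such a query might look like in Standard SQL, not the answerer's original code. It assumes the placeholder table name t from the query above and inlines the question's sample rows so that it can be run on its own.

```sql
#standardSQL
-- Sketch only: `t` is a placeholder table filled with the question's sample rows.
WITH t AS (
  SELECT 'CHN' AS country1, 'KOR' AS country2, 65 AS count UNION ALL
  SELECT 'TWN', 'KOR', 32 UNION ALL
  SELECT 'KOR', 'CHN', 43
)
SELECT
  LEAST(country1, country2) AS country1,     -- alphabetically smaller code first
  GREATEST(country1, country2) AS country2,  -- alphabetically larger code second
  SUM(count) AS count
FROM t
-- Group by ordinal position so the output aliases are not confused
-- with the input columns of the same name.
GROUP BY 1, 2;
```

Because each pair is normalized alphabetically, the TWN/KOR row would come back as KOR, TWN rather than in its original order. The same is true of the sorted STRING_AGG and ARRAY_AGG versions in the other answer.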