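I need to run a routine but very expensive query, and unfortunately, to get a ratio, I have to join its result against a nearly identical query. As a result the query takes more than 3 minutes to run. This is what I would like to do (assuming that avoiding the JOIN would reduce the query time):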
SELECT
date,
meal,
country,
COUNT(DISTINCT person, WHERE UPPER(ingredient) CONTAINS "SUN BUTTER", 10000000) as total_sunbutter_meals_per_day,
COUNT(DISTINCT person, 10000000) as total_meals,
ROUND(100*total_sunbutter_meals_per_day/total_meals,1) as percentage_meals_sunbutter
FROM [project:dataset.menu]
GROUP BY date, meal, country
This is what I am forced to do instead:
SELECT
total.date as date,
total.meal as meal,
total.country as country,
total_sunbutter_meals_per_day,
total_meals_per_day,
ROUND(100*total_sunbutter_meals_per_day/total_meals_per_day,1) as percentage_meals_sunbutter
FROM
(
SELECT
date,
meal,
country,
COUNT(DISTINCT person, 100000) as total_sunbutter_meals_per_day
FROM [project:dataset.menu]
WHERE
UPPER(ingredient) CONTAINS "SUN BUTTER"
GROUP BY date, meal, country
) as sunbutter
JOIN
(
SELECT
date,
meal,
country,
COUNT(DISTINCT person, 100000) as total_meals_per_day
FROM [project:dataset.menu]
GROUP BY date, meal, country
) as total
ON total.date = sunbutter.date AND total.meal = sunbutter.meal AND total.country = sunbutter.country
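Three questions/issues:
Are there plans to allow a field name declared or computed in the SELECT to be used in another expression of the same SELECT? In the example above, I would like to use the name of the result rather than repeat the formula inside the ROUND statement (i.e., I would like to write
total_sunbutter_meals_per_day / total_meals instead of
COUNT(DISTINCT person, WHERE UPPER(ingredient) CONTAINS "SUN BUTTER", 100000) / COUNT(DISTINCT person, 10000000)).
Thanks in advance for your help!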
Answer 0: (score: 2)
Question 1:
You can use an inner query that creates two separate fields, like this:
SELECT date, meal, country,
COUNT(DISTINCT person) total_meals,
COUNT(DISTINCT sunbutter_person) total_sunbutter_meals,
FROM (
SELECT date, meal, country, person,
IF(UPPER(ingredient) CONTAINS "SUN BUTTER", person, NULL) sunbutter_person
FROM [project:dataset.menu]
)
GROUP BY date, meal, country
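The single-pass idea can be tried locally. What follows is only a minimal sketch, assuming Python's built-in sqlite3 module as a stand-in for BigQuery and a small made-up menu table. SQLite has no CONTAINS operator or IF function, so the sketch substitutes LIKE and CASE WHEN; the sample rows and the function name count_meals are hypothetical.

```python
import sqlite3

# Hypothetical sample rows: (date, meal, country, person, ingredient).
ROWS = [
    ("2014-01-01", "lunch", "US", "ann", "Sun Butter"),
    ("2014-01-01", "lunch", "US", "ann", "bread"),
    ("2014-01-01", "lunch", "US", "bob", "bread"),
    ("2014-01-01", "lunch", "US", "cat", "sun butter cookies"),
    ("2014-01-01", "dinner", "US", "ann", "rice"),
]


def count_meals(rows):
    """Return both distinct counts per group from one scan, without a JOIN."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE menu "
        "(date TEXT, meal TEXT, country TEXT, person TEXT, ingredient TEXT)"
    )
    conn.executemany("INSERT INTO menu VALUES (?, ?, ?, ?, ?)", rows)
    query = """
        SELECT date, meal, country,
               COUNT(DISTINCT person) AS total_meals,
               COUNT(DISTINCT sunbutter_person) AS total_sunbutter_meals
        FROM (
            -- sunbutter_person is NULL unless the ingredient matches,
            -- and COUNT(DISTINCT ...) ignores NULLs.
            SELECT date, meal, country, person,
                   CASE WHEN UPPER(ingredient) LIKE '%SUN BUTTER%'
                        THEN person END AS sunbutter_person
            FROM menu
        )
        GROUP BY date, meal, country
        ORDER BY date, meal, country
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in count_meals(ROWS):
        print(row)
```

One side effect worth noting: unlike the inner JOIN version, groups with no sun-butter meals stay in the result with a count of 0 instead of being dropped.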
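Question 2:
In BigQuery, COUNT(DISTINCT) returns approximate results. If you raise the threshold below which exact results are returned, performance suffers (and the query will eventually fail), because a single worker has to keep track of all of those distinct values. For details, see BigQuery COUNT(DISTINCT value) vs COUNT(value).
If your need for exact results exceeds what COUNT(DISTINCT) can scale to, an alternative is to use GROUP EACH BY together with COUNT(*), which gives you an exact count of the distinct elements in a scalable way.
Note that you then need to solve the problem from Question 1 in a slightly different way. Something like: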
SELECT date, meal, country,
COUNT(*) total_meals,
SUM(sunbutter) total_sunbutter_meals,
FROM (
SELECT date, meal, country, person,
MAX(IF(UPPER(ingredient) CONTAINS "SUN BUTTER", 1, 0)) sunbutter,
FROM [project:dataset.menu]
GROUP EACH BY date, meal, country, person
)
GROUP BY date, meal, country
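The group-then-count approach can be checked the same way. This is a rough sketch, assuming Python's built-in sqlite3 module as a local stand-in for BigQuery: plain GROUP BY replaces GROUP EACH BY, and LIKE and CASE WHEN replace CONTAINS and IF. The sketch takes MAX over each person's rows so that a person counts as a sun-butter meal if any of their ingredients matches; that aggregate is my assumption about the intended semantics. The sample rows and the function name exact_counts are hypothetical.

```python
import sqlite3

# Hypothetical sample rows: (date, meal, country, person, ingredient).
ROWS = [
    ("2014-01-01", "lunch", "US", "ann", "Sun Butter"),
    ("2014-01-01", "lunch", "US", "ann", "bread"),
    ("2014-01-01", "lunch", "US", "bob", "bread"),
    ("2014-01-01", "lunch", "US", "cat", "sun butter cookies"),
    ("2014-01-01", "dinner", "US", "ann", "rice"),
]


def exact_counts(rows):
    """Deduplicate persons with GROUP BY, then count rows exactly."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE menu "
        "(date TEXT, meal TEXT, country TEXT, person TEXT, ingredient TEXT)"
    )
    conn.executemany("INSERT INTO menu VALUES (?, ?, ?, ?, ?)", rows)
    query = """
        SELECT date, meal, country,
               COUNT(*) AS total_meals,
               SUM(sunbutter) AS total_sunbutter_meals
        FROM (
            -- One row per (date, meal, country, person); sunbutter is 1
            -- if any ingredient of that person's meal matches.
            SELECT date, meal, country, person,
                   MAX(CASE WHEN UPPER(ingredient) LIKE '%SUN BUTTER%'
                            THEN 1 ELSE 0 END) AS sunbutter
            FROM menu
            GROUP BY date, meal, country, person
        )
        GROUP BY date, meal, country
        ORDER BY date, meal, country
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in exact_counts(ROWS):
        print(row)
```

Because the inner query already yields exactly one row per distinct person in each group, the outer COUNT(*) is an exact distinct count and no DISTINCT is needed.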
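Question 3:
Currently, you cannot refer to other fields defined in the same SELECT statement, and we have no plans yet to add that feature. However, you can always wrap the query in another query.
Instead of: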
SELECT 17 AS a, a + 1 AS b
you can write:
SELECT a, a + 1 AS b FROM (SELECT 17 AS a)
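Applied to the ratio from the question, the wrapping trick might look like the sketch below. It assumes Python's built-in sqlite3 module as a local stand-in for BigQuery (with LIKE and CASE WHEN in place of CONTAINS and IF) and a made-up table of (person, ingredient) rows; the function names alias_via_wrapper and sunbutter_percentage are hypothetical.

```python
import sqlite3

# Hypothetical sample rows: (person, ingredient).
ROWS = [
    ("ann", "Sun Butter"),
    ("ann", "bread"),
    ("bob", "bread"),
    ("cat", "rice"),
    ("dan", "soup"),
]


def alias_via_wrapper():
    """The answer's toy example: the outer query can see the alias a."""
    conn = sqlite3.connect(":memory:")
    row = conn.execute("SELECT a, a + 1 AS b FROM (SELECT 17 AS a)").fetchone()
    conn.close()
    return row


def sunbutter_percentage(rows):
    """Compute both counts in an inner query, then reuse their names outside."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE menu (person TEXT, ingredient TEXT)")
    conn.executemany("INSERT INTO menu VALUES (?, ?)", rows)
    query = """
        SELECT total_sunbutter_meals, total_meals,
               ROUND(100.0 * total_sunbutter_meals / total_meals, 1)
                   AS percentage_meals_sunbutter
        FROM (
            SELECT COUNT(DISTINCT person) AS total_meals,
                   COUNT(DISTINCT CASE
                             WHEN UPPER(ingredient) LIKE '%SUN BUTTER%'
                             THEN person END) AS total_sunbutter_meals
            FROM menu
        )
    """
    row = conn.execute(query).fetchone()
    conn.close()
    return row


if __name__ == "__main__":
    print(alias_via_wrapper())
    print(sunbutter_percentage(ROWS))
```

The outer SELECT refers to total_sunbutter_meals and total_meals by name, so the counting expressions are written only once. The sketch multiplies by 100.0 rather than 100 because SQLite would otherwise perform integer division.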