BigQuery: get the closest percentile value

Time: 2018-02-16 22:03:11

Tags: google-bigquery analytics

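I have a large table of percentiles generated with the PERCENT_RANK() function in BigQuery. The output contains many rows whose percentile values are very close to one another. I want to return only 10 rows, with values at the 100th, 90th, 80th, 70th, etc. percentiles.

More specifically, I am looking for the number closest to the 80th percentile (.8), given the following sample values: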

0.81876543 0.81123141 0.80121214 0.80012123 0.80001213 0.80001112 0.79999121

In this case, .80001112 is the closest value above .8 (by absolute distance, .79999121 is slightly closer).

Is there a SQL function I can use to return only the ten values closest to those percentiles?

1 Answer:

Answer 0: (score: 1)

The example below is for BigQuery Standard SQL:

#standardSQL
WITH `project.dataset.percentiles` AS (
  SELECT .81876543 percentile UNION ALL
  SELECT .81123141 UNION ALL
  SELECT .80121214 UNION ALL
  SELECT .80012123 UNION ALL
  SELECT .80001213 UNION ALL
  SELECT .80001112 UNION ALL
  SELECT .79999121 
), targets AS (
   SELECT check
   FROM UNNEST([1, .9, .8, .7, .6, .5, .4, .3, .2, .1]) check
)
-- For each target, collect up to 10 percentiles ordered by distance from it
SELECT check, ARRAY_AGG(percentile ORDER BY ABS(percentile - check) LIMIT 10) val
FROM `project.dataset.percentiles`
CROSS JOIN targets
WHERE ABS(percentile - check) < .05
GROUP BY check
ORDER BY check
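The query above returns up to 10 of the values closest to each target percentile (100%, 90%, 80%, etc.). Only values within .05 of a target are considered, so a target with no value in that range produces no row.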
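Since the question mentions that the percentiles come from PERCENT_RANK(), the same pattern can probably be applied directly to the ranked data while keeping the underlying value next to its percentile. The following is only a sketch under assumptions: the table `project.dataset.scores` and its column `score` are hypothetical names, not taken from the question.

```sql
#standardSQL
-- Assumption: `project.dataset.scores` and its column `score` are hypothetical
-- placeholders for the real source table.
WITH ranked AS (
  SELECT score, PERCENT_RANK() OVER (ORDER BY score) AS percentile
  FROM `project.dataset.scores`
), targets AS (
  SELECT check
  FROM UNNEST([1, .9, .8, .7, .6, .5, .4, .3, .2, .1]) check
)
SELECT
  check,
  -- Each array element carries the source value together with its percentile rank
  ARRAY_AGG(STRUCT(score, percentile)
            ORDER BY ABS(percentile - check) LIMIT 10) AS closest
FROM ranked
CROSS JOIN targets
WHERE ABS(percentile - check) < .05
GROUP BY check
ORDER BY check
```

Each row then holds one target and an array of up to 10 (score, percentile) pairs, nearest first.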

If you need only one value per target, see the query below:

#standardSQL
WITH `project.dataset.percentiles` AS (
  SELECT .81876543 percentile UNION ALL
  SELECT .81123141 UNION ALL
  SELECT .80121214 UNION ALL
  SELECT .80012123 UNION ALL
  SELECT .80001213 UNION ALL
  SELECT .80001112 UNION ALL
  SELECT .79999121 
), targets AS (
   SELECT check
   FROM UNNEST([1, .9, .8, .7, .6, .5, .4, .3, .2, .1]) check
)
-- Keep only the closest percentile; SAFE_OFFSET(0) extracts it from the one-element array
SELECT check, ARRAY_AGG(percentile ORDER BY ABS(percentile - check) LIMIT 1)[SAFE_OFFSET(0)] val
FROM `project.dataset.percentiles`
CROSS JOIN targets
WHERE ABS(percentile - check) < .05
GROUP BY check
ORDER BY check
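
One thing to be aware of: with ABS(), "closest" means closest on either side of the target. For the sample values, .79999121 is 0.00000879 away from .8 while .80001112 is 0.00001112 away, so the query above picks .79999121. If the intent is instead the nearest value at or above each target (an assumption about what the question wants), a variant along these lines might work:

```sql
#standardSQL
WITH `project.dataset.percentiles` AS (
  SELECT .81876543 percentile UNION ALL
  SELECT .81123141 UNION ALL
  SELECT .80121214 UNION ALL
  SELECT .80012123 UNION ALL
  SELECT .80001213 UNION ALL
  SELECT .80001112 UNION ALL
  SELECT .79999121
), targets AS (
  SELECT check
  FROM UNNEST([1, .9, .8, .7, .6, .5, .4, .3, .2, .1]) check
)
-- Among the values at or above the target, the smallest one is the nearest
SELECT check, MIN(percentile) val
FROM `project.dataset.percentiles`
CROSS JOIN targets
WHERE percentile >= check      -- assumption: only values at or above the target count
  AND percentile - check < .05 -- same tolerance as in the queries above
GROUP BY check
ORDER BY check
-- For the sample data this yields a single row: check = 0.8, val = 0.80001112
```

Because every candidate is already on one side of the target, a plain MIN() is enough and no ARRAY_AGG with ORDER BY is needed.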