How can I efficiently compute summary statistics over JSONB arrays nested in Postgres?

Asked: 2017-06-30 15:06:33

Tags: postgresql jsonb postgresql-9.6

I'm using Postgres 9.6.

I have this working, but I suspect there is a more efficient way. What is the best way to compute AVG, SUM, etc. over the MyEventLength array?

DROP TABLE IF EXISTS activity;
DROP SEQUENCE IF EXISTS activity_id_seq;
CREATE SEQUENCE activity_id_seq;

CREATE TABLE activity (
    id INT CHECK (id > 0) NOT NULL DEFAULT NEXTVAL ('activity_id_seq'),
    user_id INT,
    events JSONB
);

INSERT INTO activity (user_id,events) VALUES
(1, '{"MyEvent":{"MyEventLength":[450,790,1300,5400],"MyEventValue":[334,120,120,940]}}'),
(1, '{"MyEvent":{"MyEventLength":[12],"MyEventValue":[4]}}'),
(2, '{"MyEvent":{"MyEventLength":[450,790,1300,5400],"MyEventValue":[334,120,120,940]}}'),
(1, '{"MyEvent":{"MyEventLength":[1000,2000],"MyEventValue":[450,550]}}');

So far, this is the best way I have found to compute the average of the MyEventLength array for a given user_id:

SELECT avg(recs::text::numeric) FROM (
    SELECT jsonb_array_elements(a.event_length) as recs FROM (
        SELECT events->'MyEvent'->'MyEventLength' as event_length from activity
        WHERE user_id = 1
    )a
) b;

Or this variation:

SELECT avg(recs) FROM (
    SELECT jsonb_array_elements_text(a.event_length)::numeric as recs FROM (
        SELECT events->'MyEvent'->'MyEventLength' as event_length from activity
        WHERE user_id = 1
    )a
) b;

Is there a better way to do this that doesn't need so many subselects?

1 Answer:

Answer 0: (score: 1)

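You need to pass rows of scalar values to avg(). If you instead try to pass it the output of a set-returning function such as jsonb_array_elements_text(..), you will get an error like this: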

ERROR:  set-valued function called in context that cannot accept a set

So the set-returning function has to be evaluated outside the aggregate call, typically in a subquery or CTE.
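For illustration, a query of roughly the following shape is what triggers that error on Postgres 9.6, using the activity table from the question. This is a sketch of the failing pattern, not something to run in production; on later Postgres versions the wording of the error message may differ.

```sql
-- Fails: the set-returning function is nested directly inside the aggregate,
-- so avg() is handed a set instead of one scalar per row.
select avg(jsonb_array_elements_text(events->'MyEvent'->'MyEventLength')::numeric)
from activity
where user_id = 1;
```

Moving the jsonb_array_elements_text(..) call out of avg() so that it produces one row per array element is what the working variants have in common.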

Option 1, without a CTE:

select avg(v::numeric)
from (
  select
    jsonb_array_elements_text(events->'MyEvent'->'MyEventLength')
  from activity
  where user_id = 1
) as a(v);

Option 2, with a CTE (more readable):

with vals as (
  select
    jsonb_array_elements_text(events->'MyEvent'->'MyEventLength')::numeric as val
  from activity
  where user_id = 1
)
select avg(val)
from vals
;

Update, option 3: it turns out you can do this without any nested queries by using an implicit JOIN LATERAL:

select avg(val::text::numeric)
from activity a, jsonb_array_elements(a.events->'MyEvent'->'MyEventLength') vals(val)
where user_id = 1;
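Since the question asks about AVG, SUM, etc., the same lateral-join idea can be extended to several aggregates at once and to all users in one pass. The following is a minimal sketch under the assumption that every row stores MyEventLength as an array of numbers; the column aliases (n, total, mean, shortest, longest) are made up for illustration.

```sql
-- One row per array element via an explicit lateral join,
-- then ordinary aggregates grouped by user.
select
  a.user_id,
  count(*)            as n,
  sum(v.val::numeric) as total,
  avg(v.val::numeric) as mean,
  min(v.val::numeric) as shortest,
  max(v.val::numeric) as longest
from activity a
cross join lateral
  jsonb_array_elements_text(a.events->'MyEvent'->'MyEventLength') as v(val)
group by a.user_id
order by a.user_id;
```

With the sample rows from the question, user 1 has 7 elements summing to 10952 and user 2 has 4 elements summing to 7940. Using jsonb_array_elements_text here avoids the double cast (::text::numeric) that jsonb_array_elements needs.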