My table looks like this:
create table alphabet_soup(
id numeric,
index json bigint
);
My data looks like this:
(id, json) looks like this: (1, '{('key':1,'value':"A"),('key':2,'value':"C"),('key':3,'value':"C")...(600,"B")}')
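How do I count the number of A's and the number of B's in the json, and compute the percentage of occurrences of A or B? I have about 6 different kinds of values (ABCDEF), but for simplicity I am only looking at a comparison of 3 values.
I am trying to find something that helps me calculate the percentage of occurrences of each value in the key-value pairs of a json. I am using postgres 9.4. I am new to json and postgres, and I keep landing on the json functions page of the postgres manual.
I managed to get a sum, but how do I calculate the % in a nested select and display the keys and values in increasing order of occurrence, as shown below: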
value | occurrence | %
====================================
A | 300 | 50
B | 198 | 33
C | 102 | 17
The script I used to get the sum is:
select id, index->'key'::key as key
sum(case when (1,index::json->'1')::text = (1,index::json->'2')::text
then 1
else 0
end)/count(id) as res
from
alphabet_soup
group by id;
limit 10;
I get output like this:
column "alphabet_soup.id" must appear in the group by clause or be used in an aggregate function.
Thanks to Patrick for the comment. Sorry, I forgot to add that I am using postgres 9.4.
Answer 0: (score: 1)
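The simplest approach is to use the json_each_text() function to expand the json document into a regular row set. Each json document then becomes a set of rows, and you can apply aggregate functions to it as you would to any other row set. Because the function returns a set of rows, you need to use it as a row source (section 7.2.1.4 of the PostgreSQL manual) and then select the value field, which holds the category of interest. Note that the function uses the table's column through an implicit LATERAL join (section 7.2.1.5).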
SELECT id, value
FROM alphabet_soup, json_each_text("index");
This produces something like:
test=# SELECT id, value FROM alphabet_soup, json_each_text("index");
id | value
----+-------
1 | A
1 | C
1 | C
1 | B
On this row set you can apply regular aggregate functions over the appropriate windows to get the result you are looking for:
SELECT DISTINCT id, value,
count(value) OVER (PARTITION BY id, value) AS occurrence,
count(value) OVER (PARTITION BY id, value) * 100.0 /
count(id) OVER (PARTITION BY id) AS percentage
FROM (
SELECT id, value
FROM alphabet_soup, json_each_text("index") ) sub
ORDER BY id, value;
The result is:
id | value | occurrence | percentage
----+-------+------------+---------------------
1 | A | 1 | 25.0000000000000000
1 | B | 1 | 25.0000000000000000
1 | C | 2 | 50.0000000000000000
This works for any number of categories (ABCDEF) and any number of ids.
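For comparison, the same numbers can probably also be obtained with a plain GROUP BY instead of DISTINCT plus window functions, which also makes it easy to sort by occurrence as the question asks. This is only a sketch: the sample row, the flat {"key": "value"} shape of the json document, the TEMP table, and the rounding to one decimal are assumptions made for illustration, not taken from the asker's real data.

```sql
-- Assumed sample setup: a temporary table so an existing alphabet_soup is left untouched.
CREATE TEMP TABLE alphabet_soup (
    id      numeric,
    "index" json
);

-- Assumed document shape: a flat object whose values are the categories.
INSERT INTO alphabet_soup (id, "index")
VALUES (1, '{"1":"A","2":"C","3":"C","4":"B"}');

-- One row per (id, value); the window sum over the grouped counts
-- gives the total number of values per id for the percentage.
SELECT id,
       value,
       count(*) AS occurrence,
       round(count(*) * 100.0 / sum(count(*)) OVER (PARTITION BY id), 1) AS percentage
FROM alphabet_soup, json_each_text("index")
GROUP BY id, value
ORDER BY id, occurrence DESC, value;
```

With the sample row above, C (2 of 4 values) should sort before A and B (1 of 4 each). Switching DESC to ASC gives increasing order of occurrence.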
Answer 1: (score: 0)
For fun, I added a bit more to the code to do a % comparison on the result set:
WITH q1 AS (
SELECT DISTINCT id, value,
count(value) OVER (PARTITION BY id, value) AS occurrence,
count(value) OVER (PARTITION BY id, value) * 100.0 /
count(id) OVER (PARTITION BY id) AS percentage
FROM (
SELECT id, value
FROM alphabet_soup, json_each_text("index") ) sub
ORDER BY id, value
)
SELECT DISTINCT id, value, least(percentage)
FROM q1
WHERE least(percentage) > 20
ORDER BY id, value;
The output for this is:
id | value | least
----+-------+--------
1 | B | 33
1 | C | 50