I am working on a project that involves a lot of measurement data. The goal is to store this data in a database. There are 5 large data arrays (each holding 1 million floating-point values).
We are using PostgreSQL 11.3 for the database, and we thought that using arrays in Postgres would be a good idea. Saving and retrieving the data works fine so far, but now we want to build a small web application that displays these values as a graph. Sending such large arrays is of course impractical and would make the whole thing very slow, so our idea is to select only every 10,000th value and send that. This is enough to plot a simple graph with sufficient detail.
So is there a way to write an SQL query that does this? The only documented feature we have found is array slicing, but that only selects a contiguous range from a start index to an end index (see the illustration below). Or do you have any other tricks for this kind of problem? We have complete freedom over the database structure and are still at an early stage of development, so creating a new schema would also work.
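For reference, a minimal sketch of the slicing syntax mentioned above (the "Id" = 1 filter is just an example value against the table shown below); it can only return one contiguous block, not every n-th element:

-- PostgreSQL array slicing returns a contiguous range only,
-- e.g. the first 10,000 samples of one measurement row:
SELECT "Time_Values"[1:10000]
FROM public."DataPoints"
WHERE "Id" = 1;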
Here is our table structure so far:
CREATE TABLE public."DataPoints"
(
"Id" integer NOT NULL DEFAULT nextval('"DataPoints_Id_seq"'::regclass),
"TLP_Voltage" double precision NOT NULL,
"Delay" double precision NOT NULL,
"Time_Resolution" double precision NOT NULL,
"Time_Values" double precision[] NOT NULL,
"Voltage_Offset" double precision NOT NULL,
"Voltage_Resolution" double precision NOT NULL,
"Voltage_Values" double precision[] NOT NULL,
"Current_Offset" double precision NOT NULL,
"Current_Resolution" double precision NOT NULL,
"Current_Values" double precision[] NOT NULL,
"Aux_1_Offset" double precision,
"Aux_1_Resolution" double precision,
"Aux_1_Values" double precision[],
"Aux_2_Offset" double precision,
"Aux_2_Resolution" double precision,
"Aux_2_Values" double precision[],
"Measurement_Id" integer NOT NULL,
"Sequence_Id" integer NOT NULL,
CONSTRAINT "DataPoints_pkey" PRIMARY KEY ("Id"),
CONSTRAINT "DataPoints_Measurement_Id_fkey" FOREIGN KEY ("Measurement_Id")
REFERENCES public."Measurements" ("Id") MATCH SIMPLE
ON UPDATE NO ACTION
ON DELETE NO ACTION
)
Answer 0 (score: 0)
One method is to unnest and re-aggregate:
select (select array_agg(x.a)
        from unnest(v.ar) with ordinality x(a, n)
        where x.n % 1000 = 1
       )
from v;
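Here v and ar are presumably placeholders for your table and array column. Applied to the DataPoints table from the question, a sketch could look like the following (the "Id" = 1 filter, the Voltage_Values column and the 10,000 step are assumptions, not part of the original answer):

-- Keep every 10,000th voltage sample of one measurement row:
SELECT (SELECT array_agg(x.a)
        FROM unnest(d."Voltage_Values") WITH ORDINALITY AS x(a, n)
        WHERE x.n % 10000 = 1) AS voltage_downsampled
FROM public."DataPoints" d
WHERE d."Id" = 1;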
Answer 1 (score: 0)
You can also use generate_series.
create table test_array (c1 int[]);
insert into test_array (c1) VALUES (ARRAY[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15]);
select x, c1[x]
FROM test_array,
-- Get every third element. Change 3 to whatever the step should be.
generate_series(1, array_length(c1, 1), 3) as g(x);
x | c1
----+----
1 | 1
4 | 4
7 | 7
10 | 10
13 | 13
(5 rows)
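Applied to the question's table, the same approach might look like this (a sketch; the "Id" = 1 filter, the Voltage_Values column and the 10,000 step are assumptions):

-- Return index/value pairs for every 10,000th voltage sample:
SELECT g.x AS sample_index, d."Voltage_Values"[g.x] AS voltage
FROM public."DataPoints" d,
     generate_series(1, array_length(d."Voltage_Values", 1), 10000) AS g(x)
WHERE d."Id" = 1;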
Edit: After a bit of testing, it looks like Gordon's solution is much faster, which makes sense: unnest expands the array only once and then filters, whereas the generate_series version presumably pays for a separate c1[x] subscript lookup against the large array value for each generated index.
-- Create a 1 million element array
insert into test_array(c1) select array_agg(x) from generate_series(1,1000000) g(x);
-- My approach with generate_series:
explain analyze select x, c1[x] FROM test_array, generate_series(1, array_length(c1, 1), 1000) as g(x);
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------
Nested Loop (cost=0.01..27223.60 rows=1360000 width=8) (actual time=3.929..910.291 rows=1000 loops=1)
-> Seq Scan on test_array (cost=0.00..23.60 rows=1360 width=32) (actual time=0.016..0.032 rows=1 loops=1)
-> Function Scan on generate_series g (cost=0.01..10.01 rows=1000 width=4) (actual time=1.378..9.647 rows=1000 loops=1)
Planning Time: 0.063 ms
Execution Time: 919.515 ms
(5 rows)
-- Gordon's approach using unnest with ordinality
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------
Nested Loop (cost=0.00..2077.20 rows=1360 width=4) (actual time=109.685..246.758 rows=1000 loops=1)
-> Seq Scan on test_array (cost=0.00..23.60 rows=1360 width=32) (actual time=0.035..0.049 rows=1 loops=1)
-> Function Scan on unnest x (cost=0.00..1.50 rows=1 width=4) (actual time=109.603..233.817 rows=1000 loops=1)
Filter: ((n % '1000'::bigint) = 1)
Rows Removed by Filter: 999000
Planning Time: 0.131 ms
Execution Time: 256.515 ms
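Side note, not from either answer: since the table stores several parallel arrays that describe the same samples, the multi-argument form of unnest ... WITH ORDINALITY in the FROM clause can downsample them in lockstep. A sketch under the same assumptions as above (example "Id", step of 10,000):

-- Downsample time, voltage and current together, keeping every 10,000th sample:
SELECT t.n AS sample_index, t.t_val, t.v_val, t.i_val
FROM public."DataPoints" d,
     unnest(d."Time_Values", d."Voltage_Values", d."Current_Values")
       WITH ORDINALITY AS t(t_val, v_val, i_val, n)
WHERE d."Id" = 1
  AND t.n % 10000 = 1;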