我创建了一个10GB(60M记录)的表,在数据插入后手动添加了唯一索引(hidden_id)。 我有最简单的查询,但需要一分钟才能完成。
select hidden_id from netflow where hidden_id = 350000;
而且查询花了几十分钟而且#34; select * from netflow order by hidden_id limit 12500 offset 212500;
"。
这真让我困惑。
我发布了下面第一个查询的分析。有什么线索为什么这么慢?
trace select hidden_id from netflow where hidden_id = 350000;
+----------+------------------------------------------------------------------+
| ticks | stmt |
+==========+==================================================================+
| 3 | X_3 := sql.mvc(); |
| 15 | X_7=<tmp_2510>[69396995] := sql.bind(X_3=0,"sys","netflow","hidd |
: : en_id",0); :
| 227 | X_4:bat[:oid,:oid] =<tmp_13332>[69396995] := sql.tid(X_3=0,"sys" |
: : ,"netflow"); :
| 72978741 | X_36=<tmp_4053>[1] := algebra.subselect(X_7=<tmp_2510>[69396995] |
: : ,X_4=<tmp_13332>:bat[:oid,:oid][69396995],A0=350000:lng,A0=35000 :
: : 0:lng,true,true,false); :
| 17 | (X_10=<tmp_2175>[0],r1_10=<tmp_3416>[0]) := sql.bind(X_3=0,"sys" |
: : ,"netflow","hidden_id",2); :
| 14 | X_37=<tmp_13337>[0] := algebra.subselect(r1_10=<tmp_3416>[0],A0= |
: : 350000:lng,A0=350000:lng,true,true,false); :
| 6 | X_13=<tmp_3416>[0] := sql.bind(X_3=0,"sys","netflow","hidden_id" |
: : ,1); :
| 15 | X_38=<tmp_11053>[0] := algebra.subselect(X_13=<tmp_3416>[0],X_4= |
: : <tmp_13332>:bat[:oid,:oid][69396995],A0=350000:lng,A0=350000:lng :
: : ,true,true,false); :
| 4 | X_15=<tmp_4053>[1] := sql.subdelta(X_36=<tmp_4053>[1],X_4=<tmp_1 |
: : 3332>:bat[:oid,:oid][69396995],X_10=<tmp_2175>[0],X_37=<tmp_1333 :
: : 7>[0],X_38=<tmp_11053>[0]); :
| 20 | X_17=<tmp_11053>[1] := sql.projectdelta(X_15=<tmp_4053>[1],X_7=< |
: : tmp_2510>[69396995],X_10=<tmp_2175>[0],r1_10=<tmp_3416>[0],X_13= :
: : <tmp_3416>[0]); :
| 6 | X_18 := sql.resultSet(1,1,X_17=<tmp_11053>[1]); |
| 7 | sql.rsColumn(X_18=1,"sys.netflow","hidden_id","bigint",64,0,X_17 |
: : =<tmp_11053>[1]); :
| 2 | X_23 := io.stdout(); |
| 25 | sql.exportResult(X_23=="104d2":streams,X_18=1); |
| 1 | end s1_3; |
| 73011629 | X_5:void := user.s1_3(350000:lng); |
+----------+------------------------------------------------------------------+
这是正在创建的表。
CREATE TABLE "netflow" (
"time_seconds" double DEFAULT NULL,
"parsed_date" timestamp DEFAULT NULL,
"date_time_str" varchar(45) DEFAULT NULL,
"ip_layer_protocol" bigint DEFAULT NULL,
"ip_layer_protocol_code" varchar(45) DEFAULT NULL,
"first_seen_src_ip" varchar(45) DEFAULT NULL,
"first_seen_dest_ip" varchar(45) DEFAULT NULL,
"first_seen_src_port" bigint DEFAULT NULL,
"first_seen_dest_port" bigint DEFAULT NULL,
"more_fragments" varchar(45) DEFAULT NULL,
"cont_fragments" varchar(45) DEFAULT NULL,
"duration_seconds" bigint DEFAULT NULL,
"first_seen_src_payload_bytes" bigint DEFAULT NULL,
"first_seen_dest_payload_bytes" bigint DEFAULT NULL,
"first_seen_src_total_bytes" bigint DEFAULT NULL,
"first_seen_dest_total_bytes" bigint DEFAULT NULL,
"first_seen_src_packet_count" bigint DEFAULT NULL,
"first_seen_dest_packet_count" bigint DEFAULT NULL,
"record_force_out" varchar(45) DEFAULT NULL
);
更新
平台:没有并行的Windows 7
MonetDB版本:MonetDB 5服务器v11.15.19&#34; 2013年2月-SP6&#34;
存储中表格的描述:
select * from storage() where "table"='netflow';
+--------+---------+-------------------------------+-----------+----------+-----
-----+-----------+------------+------------+---------+--------+
| schema | table | column | type | location | count | typewidth | columnsize | heapsize | indices | sorted |
+========+=========+===============================+===========+==========+==========+===========+============+============+=========+========+
| sys | netflow | time_seconds | double | 17\1711 | 69396995 | 8 | 555175960 | 0 | 0 | false |
| sys | netflow | parsed_date | timestamp | 20\2054 | 69396995 | 8 | 555175960 | 0 | 0 | false |
| sys | netflow | date_time_str | varchar | 07\734 | 69396995 | 21 | 277587980 | 2684354560 | 0 | false |
| sys | netflow | ip_layer_protocol | bigint | 62\6261 | 69396995 | 8 | 555175960 | 0 | 0 | false |
| sys | netflow | ip_layer_protocol_code | varchar | 62\6213 | 69396995 | 3 | 69396995 | 524288 | 0 | false |
| sys | netflow | first_seen_src_ip | varchar | 63\6342 | 69396995 | 11 | 138793990 | 524288 | 0 | false |
| sys | netflow | first_seen_dest_ip | varchar | 23\2324 | 69396995 | 8 | 138793990 | 524288 | 0 | false |
| sys | netflow | first_seen_src_port | bigint | 15\1574 | 69396995 | 8 | 555175960 | 0 | 0 | false |
| sys | netflow | first_seen_dest_port | bigint | 23\2370 | 69396995 | 8 | 555175960 | 0 | 0 | false |
| sys | netflow | more_fragments | varchar | 65\6521 | 69396995 | 1 | 69396995 | 524288 | 0 | false |
| sys | netflow | cont_fragments | varchar | 65\6524 | 69396995 | 1 | 69396995 | 524288 | 0 | false |
| sys | netflow | duration_seconds | bigint | 65\6560 | 69396995 | 8 | 555175960 | 0 | 0 | false |
| sys | netflow | first_seen_src_payload_bytes | bigint | 65\6561 | 69396995 | 8 | 555175960 | 0 | 0 | false |
| sys | netflow | first_seen_dest_payload_bytes | bigint | 65\6562 | 69396995 | 8 | 555175960 | 0 | 0 | false |
| sys | netflow | first_seen_src_total_bytes | bigint | 65\6563 | 69396995 | 8 | 555175960 | 0 | 0 | false |
| sys | netflow | first_seen_dest_total_bytes | bigint | 65\6564 | 69396995 | 8 | 555175960 | 0 | 0 | false |
| sys | netflow | first_seen_src_packet_count | bigint | 65\6565 | 69396995 | 8 | 555175960 | 0 | 0 | false |
| sys | netflow | first_seen_dest_packet_count | bigint | 65\6566 | 69396995 | 8 | 555175960 | 0 | 0 | false |
| sys | netflow | record_force_out | varchar | 65\6567 | 69396995 | 1 | 69396995 | 524288 | 0 | false |
| sys | netflow | hidden_id | bigint | 25\2510 | 69396995 | 8 | 555175960 | 0 | 0 | false |
| sys | netflow | index_id | oid | 73\7375 | 69396995 | 8 | 555175960 | 0 | 0 | true |
+--------+---------+-------------------------------+-----------+----------+----------+-----------+------------+------------+---------+--------+
答案 0 :(得分:2)
请注意,在MonetDB中默认忽略SQL语句CREATE INDEX。 它将根据需要创建必要的哈希索引。 您的第一个查询会触发此操作。
我还注意到你似乎没有并行运行。 请始终指出所使用的平台和MonetDB版本。