Why is GLOB extremely slow only in ruby-sqlite3 on Windows?

Asked: 2014-07-03 19:52:52

Tags: sql ruby sqlite sqlite3-ruby

Here is my query (it is auto-generated):

SELECT
sn.name,
sa.house_number,
sa.entrance,
pc.postal_code,
ci.name,
mu.name,
co.name,
sa.latitude,
sa.longitude
FROM
street_addresses AS sa INDEXED BY sa_unique_address
INNER JOIN street_names   AS sn ON sa.street_name  = sn.id
INNER JOIN postal_codes   AS pc ON sa.postal_code  = pc.id
INNER JOIN cities         AS ci ON sa.city         = ci.id
INNER JOIN municipalities AS mu ON sa.municipality = mu.id
INNER JOIN counties       AS co ON mu.county       = co.id
WHERE
sn.name GLOB "THORLEIF HAUGS VEI" AND
sa.house_number = 23
ORDER BY pc.postal_code ASC, sn.name ASC, sa.house_number ASC, sa.entrance ASC
LIMIT 0, 100;

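The GLOB is there because this is a search feature that has to support wildcards. The query runs very fast on Linux, both in the sqlite3 command-line tool and in ruby-sqlite3. On Windows it runs very fast in the sqlite3 command-line tool, but takes several seconds to execute from Ruby. If I replace GLOB with =, the problem goes away.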
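Since replacing GLOB with = makes the slowdown disappear, one possible workaround is to emit GLOB only when the search term actually contains wildcard syntax. The following is a minimal sketch, not the generator's real code: `comparison_operator_for` and `street_name_condition` are hypothetical names, and it assumes that `=` on this column compares the same way GLOB does (GLOB is always case-sensitive, whereas `=` follows the column's collation).

```ruby
# Characters that start wildcard syntax in SQLite's GLOB: * ? and [ (character class).
GLOB_META = /[*?\[]/

# Returns "GLOB" when the term contains wildcard syntax, "=" otherwise.
def comparison_operator_for(term)
  term.match?(GLOB_META) ? "GLOB" : "="
end

# Hypothetical helper: builds the WHERE fragment with a bound parameter
# instead of interpolating the search term into the SQL text.
def street_name_condition(term)
  "sn.name #{comparison_operator_for(term)} ?"
end

puts street_name_condition("THORLEIF HAUGS VEI")
puts street_name_condition("THORLEIF*")
```

This only sidesteps the slow path for exact-match searches; terms that really contain wildcards would still go through GLOB.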

This is how I test the query on Windows:

> db = SQLite3::Database.open("norway.db")
> sql = File.read("test.sql")
> db.execute(sql)   # SLOW!!!
> db.execute(sql.sub("GLOB", "="))   # Super mega fast!
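To put numbers on "slow" versus "fast", the two variants can be timed side by side. This is a rough sketch: `timed` and `compare_variants` are hypothetical helpers written for this post, and they accept any object that responds to `execute`, so the sqlite3 gem is only needed in the commented usage at the bottom.

```ruby
# Runs the block and returns [result, elapsed_seconds] using a monotonic clock.
def timed
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  result = yield
  [result, Process.clock_gettime(Process::CLOCK_MONOTONIC) - started]
end

# Hypothetical helper: executes each labelled SQL variant once and records
# the row count and wall-clock time for it.
def compare_variants(db, variants)
  variants.each_with_object({}) do |(label, sql), timings|
    rows, seconds = timed { db.execute(sql) }
    timings[label] = { rows: rows.length, seconds: seconds }
  end
end

# Usage with the sqlite3 gem (assumes norway.db and test.sql from above):
#   require "sqlite3"
#   db  = SQLite3::Database.open("norway.db")
#   sql = File.read("test.sql")
#   p compare_variants(db, "GLOB" => sql, "=" => sql.sub("GLOB", "="))
```

A single run per variant is noisy; repeating each one a few times would give a fairer comparison.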

I ran EXPLAIN QUERY PLAN in both the command-line tool and Ruby, and both show that the indexes are being used:

selectid    order       from        detail
----------  ----------  ----------  --------------------------------------------------------------------------------
0           0           1           SEARCH TABLE street_names AS sn USING COVERING INDEX sn_name (name>? AND name<?)
0           1           0           SEARCH TABLE street_addresses AS sa USING INDEX sa_unique_address (street_name=?
0           2           3           SEARCH TABLE cities AS ci USING INTEGER PRIMARY KEY (rowid=?)
0           3           2           SEARCH TABLE postal_codes AS pc USING INTEGER PRIMARY KEY (rowid=?)
0           4           4           SEARCH TABLE municipalities AS mu USING INTEGER PRIMARY KEY (rowid=?)
0           5           5           SEARCH TABLE counties AS co USING INTEGER PRIMARY KEY (rowid=?)
0           0           0           USE TEMP B-TREE FOR ORDER BY
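For reference, the plan can be fetched from Ruby by prefixing the statement with EXPLAIN QUERY PLAN. The sketch below uses hypothetical helpers (`query_plan_details`, `uses_temp_sort?`) and assumes rows come back as arrays with the detail text in the last column, which is the gem's default when `results_as_hash` is off.

```ruby
# Hypothetical helper: returns the "detail" column of each plan step.
# Works with any object whose #execute returns an array of row arrays.
def query_plan_details(db, sql)
  db.execute("EXPLAIN QUERY PLAN #{sql}").map { |row| row.last.to_s }
end

# True when the plan sorts through a temporary B-tree for ORDER BY.
def uses_temp_sort?(details)
  details.any? { |detail| detail.include?("USE TEMP B-TREE FOR ORDER BY") }
end

# Usage with the sqlite3 gem (assumes db and sql from the session above):
#   details = query_plan_details(db, sql)
#   puts details
#   puts "sorts via temp B-tree" if uses_temp_sort?(details)
```

Checking for the temp B-tree step makes it easy to compare the plans of different query variants programmatically.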

This is the database I am using: norway.db. The link will surely break at some point, but hopefully we will have reached a solution by then.

I did some profiling:

result = RubyProf.profile {
  data = db.execute(sql)
}

Here are the results:

Profiling results

Here is the call graph in text format

These lines are particularly interesting:

  %total   %self      total       self       wait      child            calls    Name
                      2.395      2.395      0.000      0.000              2/2      Kernel#loop
 100.00% 100.00%      2.395      2.395      0.000      0.000                2      SQLite3::Statement#step (ruby_runtime:0}  ruby_runtime:0
                      0.000      0.000      0.000      0.000              2/2      SQLite3::Database#encoding

The profiler points to statement.rb:108, which calls the C function step, defined at statement.c:107.

Is this a Windows-specific problem in the C code? Is it a bug in SQLite 3 or a bug in the Ruby library? Any ideas on how to solve this?
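One thing that might be worth ruling out, although nothing above confirms it, is that the gem on Windows is linked against a different SQLite build than the command-line tool. The sketch below reads the library version and compile options through the connection; `sqlite_build_info` is a hypothetical helper, while `sqlite_version()` and `PRAGMA compile_options` are standard SQLite.

```ruby
# Hypothetical helper: reports the SQLite version and compile options as seen
# through the given connection (any object whose #execute returns row arrays).
def sqlite_build_info(db)
  {
    version: db.execute("SELECT sqlite_version()").first.first,
    compile_options: db.execute("PRAGMA compile_options").map(&:first)
  }
end

# Usage with the sqlite3 gem (assumes norway.db from above):
#   require "sqlite3"
#   db = SQLite3::Database.open("norway.db")
#   p sqlite_build_info(db)
```

Running `SELECT sqlite_version();` and `PRAGMA compile_options;` in the command-line tool gives the values to compare against.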


Edit: Strangely, reducing the query to this makes the problem go away, even though the GLOB is still there:

SELECT
sn.name,
sa.house_number
FROM
street_addresses AS sa INDEXED BY sa_unique_address
INNER JOIN street_names   AS sn ON sa.street_name  = sn.id
WHERE
sn.name GLOB "THORLEIF HAUGS VEI" AND
sa.house_number = 23
ORDER BY sn.name ASC, sa.house_number ASC, sa.entrance ASC
LIMIT 0, 100;

EXPLAIN QUERY PLAN output:

selectid    order       from        detail
----------  ----------  ----------  ---------------------------------------------------------------------------------------------
0           0           1           SEARCH TABLE street_names AS sn USING COVERING INDEX sn_name (name>? AND name<?)
0           1           0           SEARCH TABLE street_addresses AS sa USING COVERING INDEX sa_unique_address (street_name=? AND house_number=?)

I am not sure whether that matters, though. As I said, the speed problem seems to be in sqlite3-ruby, not in sqlite3 itself.


Edit: Odder still, removing the ORDER BY from the query also fixes the speed problem! This query is lightning fast in sqlite3-ruby on Windows:

SELECT
sn.name,
sa.house_number,
sa.entrance,
pc.postal_code,
ci.name,
mu.name,
co.name,
sa.latitude,
sa.longitude
FROM
street_addresses AS sa INDEXED BY sa_unique_address
INNER JOIN street_names   AS sn ON sa.street_name  = sn.id
INNER JOIN postal_codes   AS pc ON sa.postal_code  = pc.id
INNER JOIN cities         AS ci ON sa.city         = ci.id
INNER JOIN municipalities AS mu ON sa.municipality = mu.id
INNER JOIN counties       AS co ON mu.county       = co.id
WHERE
sn.name GLOB "THORLEIF HAUGS VEI" AND
sa.house_number = 23
LIMIT 0, 100;

What on earth is going on here!? This is the EXPLAIN QUERY PLAN output:

selectid    order       from        detail
----------  ----------  ----------  --------------------------------------------------------------------------------
0           0           1           SEARCH TABLE street_names AS sn USING COVERING INDEX sn_name (name>? AND name<?)
0           1           0           SEARCH TABLE street_addresses AS sa USING INDEX sa_unique_address (street_name=?
0           2           2           SEARCH TABLE postal_codes AS pc USING INTEGER PRIMARY KEY (rowid=?)
0           3           3           SEARCH TABLE cities AS ci USING INTEGER PRIMARY KEY (rowid=?)
0           4           4           SEARCH TABLE municipalities AS mu USING INTEGER PRIMARY KEY (rowid=?)
0           5           5           SEARCH TABLE counties AS co USING INTEGER PRIMARY KEY (rowid=?)

0 answers:

No answers yet