I couldn't find a tag for the apsw module.
import sqlitefts as fts
import apsw  # needed for apsw.Connection below
import os
import sys
from search import *
from search import OUWordTokenizer

def tokenize():
    with apsw.Connection('texts.db', flags=apsw.SQLITE_OPEN_READWRITE) as connection:
        c = connection.cursor()
        print("connection to cursor")
        # Register the custom Latin tokenizer under the name 'oulatin'
        fts.register_tokenizer(c, 'oulatin', fts.make_tokenizer_module(OUWordTokenizer('latin')))
        # fts.register_tokenizer(c, 'porter')
        print("registering tokenizer")
        c.execute("CREATE VIRTUAL TABLE IF NOT EXISTS text_idx USING fts3 (id, title, book, author, date, chapter, verse, passage, link, documentType, tokenize={})".format("oulatin"))
        c.execute("CREATE VIRTUAL TABLE IF NOT EXISTS text_idx_porter USING fts3 (id, title, book, author, date, chapter, verse, passage, link, documentType, tokenize={})".format("porter"))
        print("virtual table created")
        c.execute("COMMIT")
        c.execute("BEGIN")
        # Copy the plain 'texts' table into both FTS indexes
        c.execute("INSERT INTO text_idx(id, title, book, author, date, chapter, verse, passage, link, documentType) SELECT id, title, book, author, date, chapter, verse, passage, link, documentType FROM texts")
        c.execute("INSERT INTO text_idx_porter(id, title, book, author, date, chapter, verse, passage, link, documentType) SELECT id, title, book, author, date, chapter, verse, passage, link, documentType FROM texts")
        print("inserted data into virtual table")
        c.execute("COMMIT")
        stmt1 = 'select id, title, book, author, link from text_idx where passage MATCH "saepe commeant atque"'
        stmt2 = 'select id, title, book, author, link from text_idx_porter where passage MATCH "saepe commeant atque"'
        r1 = c.execute(stmt1)
        print(type(r1))
        r2 = c.execute(stmt2)
        print(type(r2))
        # Merge the two result sets and de-duplicate
        r3 = set(r2).union(set(r1))
        print(type(r3))
        r4 = list(r3)
        print(type(r4))
        print(r4)
When I run this, I get a segmentation fault. Output:
bash-4.3# python3 app.py
connection to cursor
registering tokenizer
virtual table created
Segmentation fault (core dumped)
The code was working fine before, and I haven't changed anything since then, so I don't know why this is happening. I'm stuck.
UPDATE:
I have tried debugging the code with gdb, and it says this:
(gdb) run app.py
Starting program: /usr/bin/python3 app.py
connection to cursor
registering tokenizer
virtual table created
During startup program terminated with signal SIGSEGV, Segmentation fault.
Now I gather that the problem is not in the code itself but in the shell or the wrapper around it. How do I deal with this?
UPDATE:
fts.register_tokenizer(c, 'oulatin', fts.make_tokenizer_module(OUWordTokenizer('latin')))
Is this the correct way to register a user-defined tokenizer?
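For comparison, here is a minimal sketch of what I understand the registration flow to be, adapted from the sqlitefts example. The WhitespaceTokenizer class, the in-memory database, and the demo table below are placeholders, not my actual OUWordTokenizer or texts.db, and I may be misreading the expected yield format (token plus UTF-8 byte offsets):

import re

import apsw
import sqlitefts as fts


class WhitespaceTokenizer(fts.Tokenizer):
    """Placeholder tokenizer: yields (token, start_byte, end_byte) tuples."""
    _word = re.compile(r'\w+', re.UNICODE)

    def tokenize(self, text):
        for m in self._word.finditer(text):
            start, end = m.span()
            token = text[start:end]
            byte_start = len(text[:start].encode('utf-8'))
            yield token, byte_start, byte_start + len(token.encode('utf-8'))


conn = apsw.Connection(':memory:')  # placeholder; my real code opens texts.db
cur = conn.cursor()
fts.register_tokenizer(cur, 'whitespace', fts.make_tokenizer_module(WhitespaceTokenizer()))
cur.execute("CREATE VIRTUAL TABLE demo USING fts3(content, tokenize=whitespace)")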
UPDATE: I have debugged the code locally with gdb, and it says this:
(gdb) run app.py
Starting program: /usr/bin/python3 app.py
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
* Running on http://0.0.0.0:5000/ (Press CTRL+C to quit)
* Restarting with stat
connection to cursor
Traceback (most recent call last):
File "app.py", line 52, in <module>
tokenize()
File "app.py", line 20, in tokenize
fts.register_tokenizer(c, 'oulatin', fts.make_tokenizer_module(OUWordTokenizer('latin')))
File "/usr/local/lib/python3.5/dist-packages/sqlitefts/tokenizer.py", line 191, in register_tokenizer
r = c.execute('SELECT fts3_tokenizer(?, ?)', (name, address_blob))
File "src/cursor.c", line 1019, in APSWCursor_execute.sqlite3_prepare
File "src/statementcache.c", line 386, in sqlite3_prepare
apsw.SQLError: SQLError: no such function: fts3_tokenizer
[Inferior 1 (process 13615) exited with code 01]
This makes me think there is something wrong with the way I am registering my tokenizer. Any ideas?
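In case it is relevant: the "no such function: fts3_tokenizer" error makes me wonder whether the SQLite library that apsw is linked against even exposes that SQL function (as far as I know, SQLite 3.11+ compiles it out unless SQLITE_ENABLE_FTS3_TOKENIZER is enabled). A small sketch of how I could inspect the build, assuming PRAGMA compile_options reports the flags the library was compiled with; I have not confirmed this is actually the cause:

import apsw

# Which SQLite is apsw actually linked against?
print("SQLite version:", apsw.sqlitelibversion())

conn = apsw.Connection(':memory:')
cur = conn.cursor()

# List compile-time options. FTS3 support needs ENABLE_FTS3, and the
# fts3_tokenizer() SQL function that sqlitefts' register_tokenizer calls
# additionally needs ENABLE_FTS3_TOKENIZER on newer SQLite builds,
# otherwise the SELECT fails with "no such function".
for (option,) in cur.execute("PRAGMA compile_options"):
    print(option)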