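I have tried two modules for reading dbf files (dbf and dbfpy), and both work fine, but I have to read the database record by record to find anything. That is really slow for large databases. Is there a module that can query tables or use CDX indexes?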
Answer 0 (score: 2)
I don't believe dbfpy supports index files, and I know dbf does not.
However, in dbf you can create a temporary index and then query it:
import dbf

big_table = dbf.Table('/path/to/some/big_table')

def criteria(record):
    "index the table using these fields"
    return record.income, record.age

index = big_table.create_index(key=criteria)
index can now be iterated over, or searched to return all matching records:
for record in index.search(match=(50000, 30)):
    print(record)
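The idea behind a temporary index can be sketched without the dbf package. The following is a plain-Python illustration, not dbf's actual implementation: it assumes all records fit in memory, and the names Person and SortedIndex and the sample rows are hypothetical. It sorts the records once by a key function, then uses binary search (the standard bisect module) to find every record whose key equals the match, instead of scanning record by record on each lookup.

```python
from bisect import bisect_left, bisect_right
from collections import namedtuple

# Hypothetical stand-in for a dbf record; not part of the dbf package.
Person = namedtuple("Person", "name age income")


class SortedIndex:
    """Toy in-memory index: records kept sorted by a key function."""

    def __init__(self, records, key):
        # sorted() is stable, so records with equal keys keep their input order.
        self._records = sorted(records, key=key)
        self._keys = [key(r) for r in self._records]

    def __iter__(self):
        # Iterating yields the records in key order.
        return iter(self._records)

    def search(self, match):
        # Binary search for the run of keys equal to `match`.
        lo = bisect_left(self._keys, match)
        hi = bisect_right(self._keys, match)
        return self._records[lo:hi]


# Made-up rows, for illustration only.
people = [
    Person("Ann", 30, 50000),
    Person("Ben", 45, 50000),
    Person("Cy", 30, 50000),
    Person("Dee", 30, 42000),
]

index = SortedIndex(people, key=lambda r: (r.income, r.age))
matches = [r.name for r in index.search((50000, 30))]
print(matches)
```

Building the index still reads every record once; the gain comes from repeated lookups against the same index.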
Sample table:
table = dbf.Table('tempu', 'name C(25); age N(3,0); income N(7,0);')
table.open()
for name, age, income in (
        ('Daniel', 33, 55000),
        ('Mike', 59, 125000),
        ('Sally', 33, 77000),
        ('Cathy', 41, 50000),
        ('Bob', 19, 22000),
        ('Lisa', 19, 25000),
        ('Nancy', 27, 50000),
        ('Oscar', 41, 50000),
        ('Peter', 41, 62000),
        ('Tanya', 33, 125000),
        ):
    table.append((name, age, income))
index = table.create_index(lambda rec: (rec.age, rec.income))
There are also methods for finding the start and end of a range:
# all the incomes of those who are 33
for rec in index.search(match=(33,), partial=True):
    print(repr(rec))
print()
# all the incomes of those between the ages of 40 - 59, inclusive
start = index.index_search(match=(40, ), nearest=True)
end = index.index_search(match=(60, ), nearest=True)
for rec in index[start:end]:
    print(repr(rec))
Prints:
Daniel 33 55000
Sally 33 77000
Tanya 33 125000
Cathy 41 50000
Oscar 41 50000
Peter 41 62000
Mike 59 125000
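For readers who want to see why a partial key and a "nearest" position are enough for these two queries, here is a plain-Python sketch of the same lookups over the sample rows above. It uses the standard bisect module rather than the dbf package, so it is an illustration of the technique and not dbf's implementation. The helper name first_at_or_after is hypothetical, and using (34,) as the upper bound for "age 33" assumes ages are whole numbers.

```python
from bisect import bisect_left

# The sample rows from the answer, as plain (name, age, income) tuples.
rows = [
    ('Daniel', 33, 55000),
    ('Mike', 59, 125000),
    ('Sally', 33, 77000),
    ('Cathy', 41, 50000),
    ('Bob', 19, 22000),
    ('Lisa', 19, 25000),
    ('Nancy', 27, 50000),
    ('Oscar', 41, 50000),
    ('Peter', 41, 62000),
    ('Tanya', 33, 125000),
]

# Sort once by (age, income), mirroring the key used for the index above.
ordered = sorted(rows, key=lambda r: (r[1], r[2]))
keys = [(age, income) for _, age, income in ordered]


def first_at_or_after(prefix):
    """Position of the first key >= prefix (hypothetical helper).

    A short tuple such as (33,) compares lower than (33, anything),
    so this lands on the first record with that age, or on the next
    higher age if there is no exact match.
    """
    return bisect_left(keys, prefix)


# Everyone aged 33: from the first key >= (33,) up to the first key >= (34,).
age_33 = ordered[first_at_or_after((33,)):first_at_or_after((34,))]

# Everyone aged 40 to 59 inclusive: stop at the first key >= (60,).
age_40_to_59 = ordered[first_at_or_after((40,)):first_at_or_after((60,))]

for name, age, income in age_33:
    print(name, age, income)
print()
for name, age, income in age_40_to_59:
    print(name, age, income)
```

Neither 40 nor 60 appears as an age in the table, which is why the range query needs a nearest-position lookup rather than an exact match.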