I am trying to index some text files into Solr using sunburnt. Here is my code:
import hashlib

import httplib2
import sunburnt

solr_url = "http://localhost:8983/solr"
h = httplib2.Http(cache="/var/tmp/solr_cache")
solr_instance = sunburnt.SolrInterface(url=solr_url, http_connection=h)

# webpages is an iterable of (url, title, webpage) tuples
for url, title, webpage in webpages:
    # Use the MD5 of the URL as the document's unique id
    html_id = hashlib.md5(url).hexdigest()
    doc = {"id": html_id, "content": webpage, "title": title}
    solr_instance.add(doc)

try:
    solr_instance.commit()
except Exception:
    print "Could not commit changes to Solr, check the log files."
else:
    print "Successfully committed changes"
But when I run it, I get the following error:
  File "/Users/ananya/Desktop/dbms project/code/extractText/ExtractText.py", line 94, in index_to_Solr
    solr_instance = sunburnt.SolrInterface(url=solr_url, http_connection=h)
  File "/Users/ananya/anaconda/lib/python2.7/site-packages/sunburnt/sunburnt.py", line 166, in __init__
    self.init_schema()
  File "/Users/ananya/anaconda/lib/python2.7/site-packages/sunburnt/sunburnt.py", line 177, in init_schema
    self.schema = SolrSchema(schemadoc, format=self.format)
  File "/Users/ananya/anaconda/lib/python2.7/site-packages/sunburnt/schema.py", line 417, in __init__
    if self.unique_key else None
KeyError: 'id'
I am very new to Solr. Please help. Do I need to make any changes to the schema file? If so, please tell me how.
Thanks.
Answer 0 (score: 3)
If you are using Solr 4.8 or later, this is a bug against sunburnt 0.6.
arafalov's fork of sunburnt has a patch that fixed it for me.
Try:
git clone git@github.com:arafalov/sunburnt.git
cd sunburnt
python setup.py install # optionally with --user
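
As for whether the schema file needs changing: my understanding, which is an assumption and not something stated in the bug report itself, is that the example schema.xml in newer Solr releases declares its fields directly under <schema> without a <fields> wrapper, while sunburnt 0.6 appears to look for them only inside the wrapper. It then finds no field named id, and looking up the uniqueKey raises the KeyError. A minimal sketch to check which layout a schema uses follows; the helper name field_names and the trimmed NEW_STYLE schema are made up for illustration.

```python
import xml.etree.ElementTree as ET


def field_names(schema_xml):
    """Return (wrapped, top_level) lists of field names from a schema.xml string.

    wrapped   -- fields declared as <schema><fields><field .../></fields></schema>
    top_level -- fields declared directly as <schema><field .../></schema>
    """
    root = ET.fromstring(schema_xml)
    wrapped = [f.get("name") for f in root.findall("./fields/field")]
    top_level = [f.get("name") for f in root.findall("./field")]
    return wrapped, top_level


# Hypothetical, heavily trimmed schema in the newer layout (no <fields> wrapper).
NEW_STYLE = """
<schema name="example" version="1.5">
  <field name="id" type="string" indexed="true" stored="true"/>
  <field name="title" type="text_general" indexed="true" stored="true"/>
  <uniqueKey>id</uniqueKey>
</schema>
"""

wrapped, top_level = field_names(NEW_STYLE)
```

If wrapped comes back empty while top_level lists your fields, the schema is in the newer layout and the patched fork is probably the simpler fix than editing the schema by hand.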