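I am a beginner with Apache Solr. I want to add documents to Apache Solr and then extract information from them.
For example, I have résumé (CV) documents in JSON format, and I want to extract fields such as "NAME", "EMAIL-ID", "EXPERIENCE", and "SKILLS".
What I have tried:
I am trying to add a document to Apache Solr using Python, but I get an error when I add it.
My code is as follows: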
from __future__ import print_function
import pysolr
solr = pysolr.Solr('http://localhost:8983/try', timeout=10)
#https://tecadmin.net/install-apache-solr-on-ubuntu/
solr.add(
{
"content": "Afreen Jamadar\nActive member of IIIT Committee in Third year\n\nSangli, Maharashtra - Email me on Indeed: indeed.com/r/Afreen-Jamadar/8baf379b705e37c6\n\nI wish to use my knowledge, skills and conceptual understanding to create excellent team\nenvironments and work consistently achieving organization objectives believes in taking initiative\nand work to excellence in my work.\n\nWORK EXPERIENCE\n\nActive member of IIIT Committee in Third year\n\nCisco Networking - Kanpur, Uttar Pradesh\n\norganized by Techkriti IIT Kanpur and Azure Skynet.\nPERSONALLITY TRAITS:\n• Quick learning ability\n• hard working\n\nEDUCATION\n\nPG-DAC\n\nCDAC ACTS\n\n2017\n\nBachelor of Engg in Information Technology\n\nShivaji University Kolhapur - Kolhapur, Maharashtra\n\n2016\n\nSKILLS\n\nDatabase (Less than 1 year), HTML (Less than 1 year), Linux. (Less than 1 year), MICROSOFT\nACCESS (Less than 1 year), MICROSOFT WINDOWS (Less than 1 year)\n\nADDITIONAL INFORMATION\n\nTECHNICAL SKILLS:\n\n• Programming Languages: C, C++, Java, .net, php.\n• Web Designing: HTML, XML\n• Operating Systems: Windows […] Windows Server 2003, Linux.\n• Database: MS Access, MS SQL Server 2008, Oracle 10g, MySql.\n\nhttps://www.indeed.com/r/Afreen-Jamadar/8baf379b705e37c6?isid=rex-download&ikw=download-top&co=IN",
"annotation": [{
"label": ["Email Address"],
"points": [{
"start": 1155,
"end": 1198,
"text": "indeed.com/r/Afreen-Jamadar/8baf379b705e37c6"
}]
}, {
"label": ["Links"],
"points": [{
"start": 1143,
"end": 1239,
"text": "https://www.indeed.com/r/Afreen-Jamadar/8baf379b705e37c6?isid=rex-download&ikw=download-top&co=IN"
}]
}, {
"label": ["Skills"],
"points": [{
"start": 743,
"end": 1140,
"text": "Database (Less than 1 year), HTML (Less than 1 year), Linux. (Less than 1 year), MICROSOFT\nACCESS (Less than 1 year), MICROSOFT WINDOWS (Less than 1 year)\n\nADDITIONAL INFORMATION\n\nTECHNICAL SKILLS:\n\n• Programming Languages: C, C++, Java, .net, php.\n• Web Designing: HTML, XML\n• Operating Systems: Windows […] Windows Server 2003, Linux.\n• Database: MS Access, MS SQL Server 2008, Oracle 10g, MySql."
}]
}, {
"label": ["Graduation Year"],
"points": [{
"start": 729,
"end": 732,
"text": "2016"
}]
}, {
"label": ["College Name"],
"points": [{
"start": 675,
"end": 702,
"text": "Shivaji University Kolhapur "
}]
}, {
"label": ["Degree"],
"points": [{
"start": 631,
"end": 672,
"text": "Bachelor of Engg in Information Technology"
}]
}, {
"label": ["Graduation Year"],
"points": [{
"start": 625,
"end": 629,
"text": "2017\n"
}]
}, {
"label": ["College Name"],
"points": [{
"start": 614,
"end": 622,
"text": "CDAC ACTS"
}]
}, {
"label": ["Degree"],
"points": [{
"start": 606,
"end": 611,
"text": "PG-DAC"
}]
}, {
"label": ["Companies worked at"],
"points": [{
"start": 438,
"end": 453,
"text": "Cisco Networking"
}]
}, {
"label": ["Email Address"],
"points": [{
"start": 104,
"end": 147,
"text": "indeed.com/r/Afreen-Jamadar/8baf379b705e37c6"
}]
}, {
"label": ["Location"],
"points": [{
"start": 62,
"end": 67,
"text": "Sangli"
}]
}, {
"label": ["Name"],
"points": [{
"start": 0,
"end": 13,
"text": "Afreen Jamadar"
}]
}],
"extras": None,
"metadata": {
"first_done_at": 1527844872000,
"last_updated_at": 1537724086000,
"sec_taken": 0,
"last_updated_by": "BIQNZm4INNfvByMqkaVwVt6OZTv2",
"status": "done",
"evaluation": "CORRECT"
}
})
Error:
Traceback (most recent call last):
File "<stdin>", line 96, in <module>
NameError: name 'null' is not defined
When I changed "extras" to None, I got a different error:
Traceback (most recent call last):
File "<stdin>", line 103, in <module>
File "/home/system/anaconda3/lib/python3.6/site-packages/pysolr.py", line 907, in add
el = self._build_doc(doc, boost=boost, fieldUpdates=fieldUpdates)
File "/home/system/anaconda3/lib/python3.6/site-packages/pysolr.py", line 822, in _build_doc
for key, value in doc.items():
AttributeError: 'str' object has no attribute 'items'
Please help me resolve this error. Any help would be greatly appreciated.
Answer 0 (score: 0)
I found the problem. My original line was:
solr = pysolr.Solr('http://localhost:8983/try',timeout = 10)
I had left out the solr segment of the URL path. The corrected line is:
solr = pysolr.Solr('http://localhost:8983/solr/try',timeout = 10)
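The two tracebacks in the question also point at things worth checking besides the URL. The NameError suggests the record was pasted as raw JSON, where null is not a Python name. The AttributeError looks consistent with add() iterating over its argument, so a single dict would be walked key by key as strings. Below is a minimal sketch under those assumptions. The helper names solr_core_url and load_docs and the trimmed RAW_CV record are hypothetical, introduced only for illustration.

```python
import json

# Hypothetical, trimmed-down version of the CV record as raw JSON text.
RAW_CV = '{"content": "Afreen Jamadar", "extras": null, "metadata": {"status": "done"}}'


def solr_core_url(host, core):
    """Build a core URL that includes the /solr/ path segment."""
    return "{}/solr/{}".format(host.rstrip("/"), core)


def load_docs(raw_json):
    """Parse JSON text (null becomes None) and always return a list of dicts."""
    parsed = json.loads(raw_json)
    return parsed if isinstance(parsed, list) else [parsed]


url = solr_core_url("http://localhost:8983", "try")
docs = load_docs(RAW_CV)
print(url)
print(docs[0]["extras"])

# With a running Solr core named "try" (an assumption taken from the question),
# the documents could then be sent as a list:
# import pysolr
# solr = pysolr.Solr(url, timeout=10)
# solr.add(docs)
```

Parsing with json.loads avoids hand-editing null into None, and wrapping the record in a list means add() receives one dict per document rather than the keys of a single dict.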