When I send this request to the Carrot2 server:
http = httplib2.Http()
my_url = 'http://localhost:8080/dcs/rest?dcs.c2stream=xml'
xml_string = etree.tostring(xml)
http.request(my_url, 'POST', body=xml_string, headers={'Content-type': 'application/x-www-form-urlencoded'})
I get the following response:
{
'status': '400',
'content-length': '1571',
'server': 'Jetty(7.3.1.v20110307)',
'cache-control': 'must-revalidate,no-cache,no-store',
'access-control-allow-origin': '*',
'content-type': 'text/html;charset=ISO-8859-1'
},
<html>\n
<head>\n
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1"/>\n
<title>Error 400 Could not parse Carrot2 XML stream: ParseError at [row,col]:[1,1]\nMessage: Content is not allowed in prolog.</title>\n
</head>\n
<body><h2>HTTP ERROR 400</h2>\n
<p>Problem accessing /dcs/rest. Reason:\n
<pre> Could not parse Carrot2 XML stream: ParseError at [row,col]:[1,1]\nMessage: Content is not allowed in prolog.</pre></p><hr /><i><small>Powered by Jetty://</small></i><br/>\n
</body>\n
</html>\n
Am I doing something wrong? I'm not sure what to look for to track down the problem.
Edit: I built the XML string with the following code:
xml = etree.Element('searchresult')
etree.SubElement(xml, 'query').text = 'dogs are fantastic pets'
for result in search_results:
    doc = etree.Element('document')
    etree.SubElement(doc, 'title').text = result['title'].replace('<b>', '').replace('</b>', '').replace('</b', '')
    etree.SubElement(doc, 'snippet').text = result['abstract'].replace('<b>', '').replace('</b>', '').replace('</b', '')
    etree.SubElement(doc, 'url').text = result['url'].replace('<b>', '').replace('</b>', '').replace('</b', '')
    xml.append(doc)
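
One thing worth checking: "Content is not allowed in prolog" at [1,1] means the parser hit a non-XML character at the very start of its input, and the URL in the request sets dcs.c2stream to the literal string xml. A minimal sketch of an alternative, assuming (not confirmed here) that the server accepts the document as the value of a form-encoded dcs.c2stream field. The helper names strip_bold, build_c2stream and build_form_body and the sample data are hypothetical, and the standard library's xml.etree.ElementTree stands in for whichever etree module is in use:

```python
import xml.etree.ElementTree as etree
from urllib.parse import urlencode, parse_qs


def strip_bold(text):
    """Remove <b>, </b> and a truncated '</b', same as the replace chain above."""
    for tag in ('<b>', '</b>', '</b'):
        text = text.replace(tag, '')
    return text


def build_c2stream(query, search_results):
    """Build the <searchresult> document and return it as a text string."""
    root = etree.Element('searchresult')
    etree.SubElement(root, 'query').text = query
    for result in search_results:
        doc = etree.SubElement(root, 'document')
        etree.SubElement(doc, 'title').text = strip_bold(result['title'])
        etree.SubElement(doc, 'snippet').text = strip_bold(result['abstract'])
        etree.SubElement(doc, 'url').text = strip_bold(result['url'])
    return etree.tostring(root, encoding='unicode')


def build_form_body(query, search_results):
    """URL-encode the XML as the value of the dcs.c2stream form field."""
    return urlencode({'dcs.c2stream': build_c2stream(query, search_results)})


# Made-up sample data, only to exercise the helpers.
sample_results = [{
    'title': '<b>Dogs</b> as pets',
    'abstract': 'Why <b>dogs</b> are great',
    'url': 'http://example.com/dogs',
}]
body = build_form_body('dogs are fantastic pets', sample_results)

# Sanity check: the form field decodes back to the XML document.
assert parse_qs(body)['dcs.c2stream'][0].startswith('<searchresult>')
```

The resulting body could then be passed as body= to http.request with the same Content-type header, using 'http://localhost:8080/dcs/rest' without the ?dcs.c2stream=xml query string. Whether the server wants this encoding or multipart/form-data is something to confirm against the Carrot2 DCS documentation.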