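I am new to Python and am currently using Python 2.7.
I am processing XML with ElementTree and storing the result in MongoDB. The XML I need to process is "http://www.sec.gov/Archives/edgar/usgaap.rss.xml". Here is the code: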
import os
import cgi
import sqlite3 as litefire
import sys
sys.stderr = sys.stdout
from xml.etree import ElementTree
from pymongo import Connection
connc2=Connection('localhost',27017)
db2=connc2['rss']
rss=db2.rss
xmlrss=[]
treexsdr = ElementTree.parse('xbrlrss_all.xml')
i=0
k=0
o=0
o2=0
iter = treexsdr.getiterator()
for element in iter:
    if element.tag:
        o=i+k
        xmlname=element.tag
    if element.keys():
        attributedict = dict(element.items())
        for name, value in element.items():
            krishna=element.items()
    if element.text:
        text = element.text
    xmlnamelist={"xmlname":xmlname,"text":text,"ownid":o,"parentid":o2,"xmlattkeys":{k:v for k,v in krishna}}
    xmlrss.append(xmlnamelist)
    if element.getchildren():
        o2=o
        for child in element:
            k=k+1
    i=i+1
rss.insert(xmlrss)
When I use krishna = dict(element.items()), I get the following error message in my IDE:
Message File Name Line Position
Traceback
<module> D:\test\mongo_rss.py 44
insert C:\Python27\lib\site-packages\pymongo\collection.py 312
InvalidDocument: key '{http://www.sec.gov/Archives/edgar}file' must not contain '.'
If I use krishna = element.items() instead, this is what I get in MongoDB:
{
"_id" : ObjectId("4f69bb6e17ea930fd803a958"),
"text" : "en-us",
"xmlname" : "language",
"xmlattkeys" : [["href", "http://www.sec.gov/Archives/edgar/xbrlrss.all.xml"], ["type", "application/rss+xml"], ["rel", "self"]],
"parentid" : 2,
"ownid" : 16
}
But I want:
{
"_id" : ObjectId("4f69bb6e17ea930fd803a958"),
"text" : "en-us",
"xmlname" : "language",
"xmlattkeys" : {"href":"http://www.sec.gov/Archives/edgar/xbrlrss.all.xml", "type":"application/rss+xml", "rel":"self"},
"parentid" : 2,
"ownid" : 16
}
Please help me do this.
Answer 0 (score: 5)
Instead of
for name, value in element.items():
    krishna=element.items()
do
krishna = dict(element.items())
(You might also consider a more descriptive name for this variable.)
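The traceback in the question suggests that one more step may be needed: MongoDB rejected the key '{http://www.sec.gov/Archives/edgar}file' because it contains '.'. Below is a minimal sketch of one way around this, assuming it is acceptable to replace the dots in attribute names with another character. The helper name safe_attribute_dict and the '_' replacement are illustrative choices, not part of the original code.

```python
def safe_attribute_dict(items, replacement="_"):
    """Build a dict from (name, value) pairs, replacing '.' in each name.

    Hypothetical helper: the '.' in namespaced attribute names is what
    triggered the InvalidDocument error shown in the question.
    """
    return {name.replace(".", replacement): value for name, value in items}


# Sample pairs shaped like the output of element.items(); values are made up.
items = [
    ("{http://www.sec.gov/Archives/edgar}file", "a.xml"),
    ("type", "application/rss+xml"),
]
attrs = safe_attribute_dict(items)
```

Only the keys are rewritten; the values keep their dots. Two different attribute names could in principle collapse to the same key after replacement, so it may be worth picking a replacement string that cannot occur in the input.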
Answer 1 (score: 1)
You can use a dict comprehension:
xmlnamelist={"xmlname":xmlname,"text":text,"xmlattkeys": {k:v for k,v in krishna}}
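As a small illustration of what the comprehension does, here is a sketch using the attribute pairs from the MongoDB output in the question. The list literal stands in for the return value of element.items().

```python
# Pairs as they appear in the question's stored document.
krishna = [
    ("href", "http://www.sec.gov/Archives/edgar/xbrlrss.all.xml"),
    ("type", "application/rss+xml"),
    ("rel", "self"),
]

# Each (name, value) pair becomes one key/value entry in the dict.
xmlattkeys = {k: v for k, v in krishna}
```

The result maps each attribute name to its value, which is the shape the question asks for in "xmlattkeys".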
Answer 2 (score: 1)
You could try this:
xmlnamelist={"xmlname":xmlname,"text":text,"xmlattkeys":dict(krishna)}
dict() accepts an iterable of key/value pairs, so this form should work. A few more corrections:
for element in iter:
    xmlname = element.tag if element.tag else ""
    attributedict = dict(element.items()) if element.keys() else {}
    text = element.text if element.text else ""
    xmlnamelist = {"xmlname": xmlname,
                   "text": text,
                   "xmlattkeys": attributedict}
    xmlrss.append(xmlnamelist)
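Note that you need to supply default values. Otherwise you may get an error about an undefined variable, or a variable may still hold a stale (wrong) value left over from a previous element.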
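To make the point about defaults concrete, here is a self-contained sketch that runs without MongoDB. The SAMPLE string is a made-up stand-in for the RSS file, element_to_doc is a hypothetical helper name, and root.iter() is used in place of getiterator() on the assumption that a newer ElementTree is available.

```python
from xml.etree import ElementTree

# Hypothetical sample, kept on one logical line so no whitespace-only text appears.
SAMPLE = (
    '<rss version="2.0"><channel>'
    '<language>en-us</language>'
    '<link href="http://example.com/feed.xml" rel="self"/>'
    '</channel></rss>'
)


def element_to_doc(element):
    """Return one document per element, with a default for every field."""
    return {
        "xmlname": element.tag or "",         # "" if there is no tag
        "text": element.text or "",           # "" if there is no text
        "xmlattkeys": dict(element.items()),  # {} if there are no attributes
    }


root = ElementTree.fromstring(SAMPLE)
docs = [element_to_doc(element) for element in root.iter()]
```

Because every field is computed fresh for each element, an element without text or attributes gets an empty value rather than whatever the previous element happened to leave behind.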