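My code is shown below. It fetches the XML from https://www.sec.gov/Archives/edgar/data/1413909/000149315218018055/dsgt-20180930.xml.
I want to build a dictionary from the keys and values in the 'xbrli:xbrl' element, that is, from the namespace declarations shown in the second code block below.
However, my code returns an empty dictionary. It skips xbrli:xbrl entirely and goes straight to link:schemaRef.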
import requests
import pandas as pd
import urllib.request as urllib2
import xml.etree.ElementTree as ET
from lxml import etree


def namespaces(url):
    tree = ET.parse(urllib2.urlopen(url))
    root = tree.getroot()
    d = dict(root.attrib)
    return d.keys()
This is what I want to build the dictionary from:
<xbrli:xbrl
xmlns:xbrli="http://www.xbrl.org/2003/instance"
xmlns:DSGT="http://dsgtag.com/20180930"
xmlns:country="http://xbrl.sec.gov/country/2017-01-31"
xmlns:currency="http://xbrl.sec.gov/currency/2017-01-31"
xmlns:dei="http://xbrl.sec.gov/dei/2018-01-31"
xmlns:iso4217="http://www.xbrl.org/2003/iso4217"
xmlns:link="http://www.xbrl.org/2003/linkbase"
xmlns:nonnum="http://www.xbrl.org/dtr/type/non-numeric"
xmlns:num="http://www.xbrl.org/dtr/type/numeric"
xmlns:ref="http://www.xbrl.org/2006/ref"
xmlns:srt="http://fasb.org/srt/2018-01-31"
xmlns:us-gaap="http://fasb.org/us-gaap/2018-01-31"
xmlns:us-roles="http://fasb.org/us-roles/2018-01-31"
xmlns:xbrldi="http://xbrl.org/2006/xbrldi"
xmlns:xbrldt="http://xbrl.org/2005/xbrldt"
xmlns:xlink="http://www.w3.org/1999/xlink"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
>...</xbrli:xbrl>
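Answer 0 (score: 0)
This solution is based on ET iterparse. With the 'start-ns' event, iterparse reports each namespace declaration as a (prefix, URI) pair, which can be passed straight to dict().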
from io import BytesIO
import xml.etree.ElementTree as ET
import requests
from pprint import pprint

r = requests.get('https://www.sec.gov/Archives/edgar/data/1413909/000149315218018055/dsgt-20180930.xml')
if r.status_code == 200:
    # Parse the raw bytes so the parser honors the encoding declared in the XML itself
    xml_data = BytesIO(r.content)
    # Each 'start-ns' event yields a (prefix, uri) tuple
    document_namespaces = dict(node for _, node in ET.iterparse(xml_data, events=['start-ns']))
    pprint(document_namespaces)

Output

{'DSGT': 'http://dsgtag.com/20180930',
 'country': 'http://xbrl.sec.gov/country/2017-01-31',
 'currency': 'http://xbrl.sec.gov/currency/2017-01-31',
 'dei': 'http://xbrl.sec.gov/dei/2018-01-31',
 'iso4217': 'http://www.xbrl.org/2003/iso4217',
 'link': 'http://www.xbrl.org/2003/linkbase',
 'nonnum': 'http://www.xbrl.org/dtr/type/non-numeric',
 'num': 'http://www.xbrl.org/dtr/type/numeric',
 'ref': 'http://www.xbrl.org/2006/ref',
 'srt': 'http://fasb.org/srt/2018-01-31',
 'us-gaap': 'http://fasb.org/us-gaap/2018-01-31',
 'us-roles': 'http://fasb.org/us-roles/2018-01-31',
 'xbrldi': 'http://xbrl.org/2006/xbrldi',
 'xbrldt': 'http://xbrl.org/2005/xbrldt',
 'xbrli': 'http://www.xbrl.org/2003/instance',
 'xlink': 'http://www.w3.org/1999/xlink',
 'xsi': 'http://www.w3.org/2001/XMLSchema-instance'}
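
As for why the code in the question comes back empty: ElementTree treats xmlns:* declarations as namespace bindings, not as ordinary attributes, so they never appear in root.attrib. The sketch below reproduces this offline. SAMPLE is a made-up, trimmed stand-in for the filing and collect_namespaces is a hypothetical helper name; neither comes from the original post.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical, trimmed stand-in for the SEC filing (not the real document).
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<xbrli:xbrl
    xmlns:xbrli="http://www.xbrl.org/2003/instance"
    xmlns:link="http://www.xbrl.org/2003/linkbase"
    xmlns:xlink="http://www.w3.org/1999/xlink">
  <link:schemaRef xlink:type="simple" xlink:href="dsgt-20180930.xsd"/>
</xbrli:xbrl>
"""


def collect_namespaces(xml_bytes):
    """Return {prefix: uri} for every namespace declared in xml_bytes."""
    events = ET.iterparse(io.BytesIO(xml_bytes), events=["start-ns"])
    # Each 'start-ns' event yields a (prefix, uri) tuple.
    return dict(ns for _, ns in events)


root = ET.fromstring(SAMPLE)
root_attributes = dict(root.attrib)  # xmlns:* declarations are not stored here
declared = collect_namespaces(SAMPLE)

print(root.tag)         # the prefix is already resolved into {uri}xbrl form
print(root_attributes)  # {}
print(declared)
```

Assuming the real file behaves like this sample, the same helper could be fed r.content from the request above. If lxml is preferred (the question already imports it), its root element exposes the same mapping through the nsmap property.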