Find and replace in Python XML

Asked: 2019-03-03 14:53:29

Tags: python xml

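The function below fetches the XML from this URL: https://www.sec.gov/Archives/edgar/monthly/xbrlrss-2018-12.xml

Note that the XML contains many tags with the 'edgar:' prefix.

What is the simplest way to find every "edgar:" in the whole XML file and replace it with "edgar_"?

Thanks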

import urllib.request as urllib2
import xml.etree.ElementTree as ET

def quarter_filing_urls(year, month):
    # Build the monthly XBRL RSS feed URL, e.g. .../xbrlrss-2018-12.xml
    url = "https://www.sec.gov/Archives/edgar/monthly/xbrlrss-" + str(year) + "-" + str(month) + ".xml"
    # Download the feed and parse it into an ElementTree
    tree = ET.parse(urllib2.urlopen(url))
    root = tree.getroot()
    return root
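One possible way to do the replacement is to treat the feed as text, rewrite the prefix, and only then parse it. The sketch below is a minimal illustration, not a tested solution for the real feed: `SAMPLE`, `replace_edgar_prefix`, and `parse_without_prefix` are hypothetical names, and the sample document and its namespace URI are made up. It rewrites "edgar:" only where it starts a tag name, so prefixed attributes and any "edgar:" inside text content are left alone. A plain `xml_text.replace("edgar:", "edgar_")` would be simpler still, but it would touch every occurrence.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the downloaded feed; the namespace URI is a placeholder.
SAMPLE = """<rss version="2.0" xmlns:edgar="http://example.com/edgar-ns">
  <channel>
    <item>
      <edgar:xbrlFiling>
        <edgar:companyName>Example Corp</edgar:companyName>
      </edgar:xbrlFiling>
    </item>
  </channel>
</rss>"""

# Match "edgar:" only directly after "<" or "</", i.e. at the start of a tag name.
_TAG_PREFIX = re.compile(r"(</?)edgar:")


def replace_edgar_prefix(xml_text):
    """Return xml_text with the 'edgar:' tag prefix rewritten to 'edgar_'."""
    return _TAG_PREFIX.sub(r"\1edgar_", xml_text)


def parse_without_prefix(xml_text):
    """Rewrite the prefix, then parse the resulting text into an Element."""
    return ET.fromstring(replace_edgar_prefix(xml_text))


root = parse_without_prefix(SAMPLE)
# The renamed tags are ordinary, un-namespaced names, so plain paths work.
company = root.find("./channel/item/edgar_xbrlFiling/edgar_companyName")
print(company.text)
```

To apply this to the real feed, the response body would have to be read and decoded to a string first; which encoding the feed uses is not stated here, so that step is left out.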

Update

One option is to use namespaces, as shown below. But when I try it, I get: AttributeError: 'set' object has no attribute 'items'

def quarter_filing_urls(year, month):

    url = "https://www.sec.gov/Archives/edgar/monthly/xbrlrss-" + str(year) + "-" + str(month) + ".xml"
    tree = ET.parse(urllib2.urlopen(url))
    root = tree.getroot()

    filings = []
    namespaces = {"edgar:xbrlFiling", 'rss'}
    for item in root.findall("./channel/item/edgar:xbrlFiling/", namespaces):
        filing = dict(item.attrib)
        filings.append(filing)

    return filings
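The error message is consistent with `namespaces` being written as a set literal: `findall` expects a dict that maps each prefix to its namespace URI and calls `.items()` on it. The following is a minimal sketch of the dict-based form, run against a made-up sample rather than the real feed. `SAMPLE`, `parse_with_namespaces`, and `filings_from_root` are hypothetical names, and the URI `http://example.com/edgar-ns` is a placeholder. To avoid guessing the real URI, the sketch reads the prefix map from the document itself with `ET.iterparse` and the `start-ns` event. It also collects each child's text instead of `attrib`, on the assumption that the values sit in element text as they do in the sample.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the downloaded feed; the namespace URI is a placeholder.
SAMPLE = b"""<?xml version="1.0"?>
<rss version="2.0" xmlns:edgar="http://example.com/edgar-ns">
  <channel>
    <item>
      <title>Example filing</title>
      <edgar:xbrlFiling>
        <edgar:companyName>Example Corp</edgar:companyName>
        <edgar:formType>10-K</edgar:formType>
      </edgar:xbrlFiling>
    </item>
  </channel>
</rss>"""


def parse_with_namespaces(source):
    """Parse a file-like object and return (root, {prefix: uri})."""
    namespaces = {}
    parser = ET.iterparse(source, events=("start-ns",))
    # Each "start-ns" event carries a (prefix, uri) pair declared in the document.
    for _event, (prefix, uri) in parser:
        namespaces[prefix] = uri
    # The root attribute is available once the source has been fully read.
    return parser.root, namespaces


def filings_from_root(root, namespaces):
    """Return one dict per edgar:xbrlFiling, keyed by local child tag name."""
    filings = []
    for filing in root.findall("./channel/item/edgar:xbrlFiling", namespaces):
        # Parsed tags look like "{uri}name"; keep only the local name.
        filings.append({child.tag.split("}", 1)[-1]: child.text for child in filing})
    return filings


root, namespaces = parse_with_namespaces(io.BytesIO(SAMPLE))
print(namespaces)
print(filings_from_root(root, namespaces))
```

With a dict like this, the "edgar:" prefix in the path is resolved through the mapping, so no text replacement is needed.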

0 answers:

No answers yet