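I have searched all over SO (including here) and elsewhere, but I am still struggling to extract specific information from XML when namespace prefixes are involved. I am trying to use ElementTree to pull the URL of the "instance document" out of the feed below. This is the line that contains the URL: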
<edgar:xbrlFile edgar:sequence="2" edgar:file="qcom-20090927.xml" edgar:type="EX-101.INS" edgar:size="1479637" edgar:description="EX-101 INSTANCE DOCUMENT" edgar:url="http://www.sec.gov/Archives/edgar/data/804328/000095012309072780/qcom-20090927.xml" />
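I have tried many different approaches, but .findall keeps returning an empty list. I have also tried walking down the tree before searching. Can someone help me get this information into a variable? Thank you very much for your help.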
<?xml version="1.0" encoding="windows-1252"?>
<?xml-stylesheet type="text/xsl" href="/rss/styles/shared_xsl_stylesheet_v2.xml"?>
<rss version="2.0">
<channel>
<title>All XBRL Data Submitted to the SEC for 2009-12</title>
<link>http://www.sec.gov/spotlight/xbrl/filings-and-feeds.shtml</link>
<atom:link href="http://www.sec.gov/Archives/edgar/monthly/xbrlrss-2009-12.xml" rel="self" type="application/rss+xml" xmlns:atom="http://www.w3.org/2005/Atom"/>
<description>This is a list all of the filings containing XBRL for 2009-12</description>
<language>en-us</language>
<pubDate>Tue, 25 Jun 2013 00:00:00 EDT</pubDate>
<lastBuildDate>Tue, 25 Jun 2013 00:00:00 EDT</lastBuildDate>
<item>
<title>QUALCOMM INC/DE (0000804328) (Filer)</title>
<link>http://www.sec.gov/Archives/edgar/data/804328/000095012309072780/0000950123-09-072780-index.htm</link>
<guid>http://www.sec.gov/Archives/edgar/data/804328/000095012309072780/0000950123-09-072780-xbrl.zip</guid>
<enclosure url="http://www.sec.gov/Archives/edgar/data/804328/000095012309072780/0000950123-09-072780-xbrl.zip" length="126771" type="application/zip" />
<description>10-K/A</description>
<pubDate>Tue, 22 Dec 2009 17:23:59 EST</pubDate>
<edgar:xbrlFiling xmlns:edgar="http://www.sec.gov/Archives/edgar">
<edgar:companyName>QUALCOMM INC/DE</edgar:companyName>
<edgar:formType>10-K/A</edgar:formType>
<edgar:filingDate>12/22/2009</edgar:filingDate>
<edgar:cikNumber>0000804328</edgar:cikNumber>
<edgar:accessionNumber>0000950123-09-072780</edgar:accessionNumber>
<edgar:fileNumber>000-19528</edgar:fileNumber>
<edgar:acceptanceDatetime>20091222172359</edgar:acceptanceDatetime>
<edgar:period>20090927</edgar:period>
<edgar:assistantDirector>11</edgar:assistantDirector>
<edgar:assignedSic>3663</edgar:assignedSic>
<edgar:fiscalYearEnd>0930</edgar:fiscalYearEnd>
<edgar:xbrlFiles>
<edgar:xbrlFile edgar:sequence="1" edgar:file="a54714e10vkza.htm" edgar:type="10-K/A" edgar:size="19974" edgar:description="10-K/A" edgar:url="http://www.sec.gov/Archives/edgar/data/804328/000095012309072780/a54714e10vkza.htm" />
**<edgar:xbrlFile edgar:sequence="2" edgar:file="qcom-20090927.xml" edgar:type="EX-101.INS" edgar:size="1479637" edgar:description="EX-101 INSTANCE DOCUMENT" edgar:url="http://www.sec.gov/Archives/edgar/data/804328/000095012309072780/qcom-20090927.xml" />**
<edgar:xbrlFile edgar:sequence="3" edgar:file="qcom-20090927.xsd" edgar:type="EX-101.SCH" edgar:size="18628" edgar:description="EX-101 SCHEMA DOCUMENT" edgar:url="http://www.sec.gov/Archives/edgar/data/804328/000095012309072780/qcom-20090927.xsd" />
<edgar:xbrlFile edgar:sequence="4" edgar:file="qcom-20090927_cal.xml" edgar:type="EX-101.CAL" edgar:size="50670" edgar:description="EX-101 CALCULATION LINKBASE DOCUMENT" edgar:url="http://www.sec.gov/Archives/edgar/data/804328/000095012309072780/qcom-20090927_cal.xml" />
<edgar:xbrlFile edgar:sequence="5" edgar:file="qcom-20090927_lab.xml" edgar:type="EX-101.LAB" edgar:size="258068" edgar:description="EX-101 LABELS LINKBASE DOCUMENT" edgar:url="http://www.sec.gov/Archives/edgar/data/804328/000095012309072780/qcom-20090927_lab.xml" />
<edgar:xbrlFile edgar:sequence="6" edgar:file="qcom-20090927_pre.xml" edgar:type="EX-101.PRE" edgar:size="133865" edgar:description="EX-101 PRESENTATION LINKBASE DOCUMENT" edgar:url="http://www.sec.gov/Archives/edgar/data/804328/000095012309072780/qcom-20090927_pre.xml" />
<edgar:xbrlFile edgar:sequence="7" edgar:file="qcom-20090927_def.xml" edgar:type="EX-101.DEF" edgar:size="21223" edgar:description="EX-101 DEFINITION LINKBASE DOCUMENT" edgar:url="http://www.sec.gov/Archives/edgar/data/804328/000095012309072780/qcom-20090927_def.xml" />
</edgar:xbrlFiles>
</edgar:xbrlFiling>
</item>
<item>
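Answer 0 (score: 1)
Assume root is the root node of the ElementTree.
The namespace is read from the 'xmlns:edgar' attribute of the 'edgar:xbrlFiling' node:
xmlns:edgar="http://www.sec.gov/Archives/edgar"
ElementTree encodes edgar:any_tag as the Python string:
ns + 'any_tag'
where ns is the namespace URI wrapped in curly braces:
ns = '{http://www.sec.gov/Archives/edgar}'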
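A quick way to see this encoding is to print the tags ElementTree actually stores. The sketch below uses a trimmed-down stand-in for the feed (the SAMPLE string is an assumption made for illustration, not the full SEC file) and shows that un-namespaced tags stay plain while edgar: tags become '{uri}tag':

```python
import xml.etree.ElementTree as ET

# Hypothetical, shortened stand-in for the SEC feed in the question.
SAMPLE = """<rss version="2.0">
  <channel>
    <item>
      <edgar:xbrlFiling xmlns:edgar="http://www.sec.gov/Archives/edgar">
        <edgar:formType>10-K/A</edgar:formType>
      </edgar:xbrlFiling>
    </item>
  </channel>
</rss>"""

root = ET.fromstring(SAMPLE)

# The namespace URI wrapped in braces; ElementTree prepends this to each tag.
ns = '{http://www.sec.gov/Archives/edgar}'

# Collect every tag exactly as ElementTree stores it.
tags = [elem.tag for elem in root.iter()]
print(tags)

# Searching with the brace-qualified name finds the element.
form_type = root.find('.//' + ns + 'formType')
print(form_type.text)
```

Searching for the bare name 'formType' would find nothing here, which is one common reason .findall returns an empty list.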
So, to find all xbrlFile nodes, you can use the following XPath expression:
xbrlFiles = root.findall('.//' + ns + 'xbrlFile')
To get the URL, read the ns + 'url' attribute (here, from the second file, which is the instance document):
myurl = xbrlFiles[1].attrib[ns + 'url']
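Indexing with [1] relies on the instance document always being the second file. As a hedged alternative, the sketch below selects the file by its edgar:type attribute instead, and passes a prefix map to findall so the path can be written with the edgar: prefix. SAMPLE is a shortened stand-in for the feed and instance_urls is a hypothetical helper name, both introduced only for illustration:

```python
import xml.etree.ElementTree as ET

EDGAR_URI = 'http://www.sec.gov/Archives/edgar'
ns = '{' + EDGAR_URI + '}'

# Hypothetical, shortened stand-in for the SEC feed in the question.
SAMPLE = """<rss version="2.0">
<channel>
<item>
<edgar:xbrlFiling xmlns:edgar="http://www.sec.gov/Archives/edgar">
<edgar:xbrlFiles>
<edgar:xbrlFile edgar:sequence="1" edgar:type="10-K/A" edgar:url="http://www.sec.gov/Archives/edgar/data/804328/000095012309072780/a54714e10vkza.htm" />
<edgar:xbrlFile edgar:sequence="2" edgar:type="EX-101.INS" edgar:url="http://www.sec.gov/Archives/edgar/data/804328/000095012309072780/qcom-20090927.xml" />
</edgar:xbrlFiles>
</edgar:xbrlFiling>
</item>
</channel>
</rss>"""


def instance_urls(root):
    """Return the edgar:url of every xbrlFile whose edgar:type is EX-101.INS."""
    # The prefix map lets the path use 'edgar:' instead of the brace form.
    files = root.findall('.//edgar:xbrlFile', namespaces={'edgar': EDGAR_URI})
    # Attribute names are still stored in brace form, so use ns + 'name'.
    return [f.get(ns + 'url') for f in files
            if f.get(ns + 'type') == 'EX-101.INS']


root = ET.fromstring(SAMPLE)
urls = instance_urls(root)
print(urls)
```

For the real feed saved to disk, root could be obtained with ET.parse(path).getroot() instead of ET.fromstring.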