I am trying to extract values from XML that looks like this:
<transactionCoding>
<transactionFormType>4</transactionFormType>
<transactionCode>S</transactionCode>
<equitySwapInvolved>0</equitySwapInvolved>
<footnoteId id="F1"/>
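I can get the value of every tag except footnoteId, which returns nothing. I have tried footnoteid.string, gettext, getattr, and so on, but none of them work. I need to get the value F1 from the tag, but I cannot figure out how.
Here is the code: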
import os
from bs4 import BeautifulSoup

def parse_xml_string(fls):
    temp = fls.find_next("value")
    t = temp.text
    print(t)
    return t

if __name__ == '__main__':
    nonderivtxn = soup.find_all("nonderivativetransaction")
    nd = [[] for _ in range(len(nonderivtxn))]
    for index in range(len(nonderivtxn)):
        coding = nonderivtxn[index].find("transactioncoding")
        tformtype = coding.transactionformtype.text
        tcode = coding.transactioncode.text
        swapinvolved = coding.equityswapinvolved.text
        footnote = coding.footnoteid.gettext()
        print(tcode, swapinvolved, footnote.content, tformtype)
Answer 0 (score: 0)
If your XML looks like this:
<footnotes><footnote id="F1">See Exhibit 99.1 for text of footnote (1).</footnote><footnote id="F2">See Exhibit 99.1 for text of footnote (2).</footnote><footnote id="F3">See Exhibit 99.1 for text of footnote (3).</footnote><footnote id="F4">See Exhibit 99.1 for text of footnote (4).</footnote><footnote id="F5">See Exhibit 99.1 for text of footnote (5).</footnote></footnotes>
Use the find_all function to load the footnotes into an array. You can then access the values with the following code:
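def get_footnote_list(id, footnote_array):
    # id: list of footnote ids such as ["F1"]; footnote_array: <footnote> tags from find_all
    fl = []
    for i in range(len(id)):
        for j in range(len(footnote_array)):
            # Match on the tag's id attribute, not on its text
            if id[i] == footnote_array[j].get("id"):
                fl.append(footnote_array[j].decode_contents())
    return fl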
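To connect this back to the question: <footnoteId id="F1"/> is an empty tag, so .string has nothing to return. The value F1 is stored in the id attribute, which can be read with .get("id"). The following is a minimal sketch, not the asker's actual script. The sample document, the ownershipDocument wrapper, the choice of "html.parser", and the helper names footnote_ids and footnote_texts are all assumptions made for illustration.

```python
from bs4 import BeautifulSoup

# Sample data assembled from the snippets in this thread.
# The wrapper element and overall layout are assumptions.
SAMPLE = """
<ownershipDocument>
  <nonDerivativeTransaction>
    <transactionCoding>
      <transactionFormType>4</transactionFormType>
      <transactionCode>S</transactionCode>
      <equitySwapInvolved>0</equitySwapInvolved>
      <footnoteId id="F1"/>
    </transactionCoding>
  </nonDerivativeTransaction>
  <footnotes>
    <footnote id="F1">See Exhibit 99.1 for text of footnote (1).</footnote>
    <footnote id="F2">See Exhibit 99.1 for text of footnote (2).</footnote>
  </footnotes>
</ownershipDocument>
"""


def footnote_ids(coding):
    """Return the id attribute of every <footnoteId> tag under `coding`."""
    # html.parser lowercases tag names, hence "footnoteid".
    return [tag.get("id") for tag in coding.find_all("footnoteid")]


def footnote_texts(ids, footnotes):
    """Look up the text of each footnote whose id appears in `ids`."""
    by_id = {fn.get("id"): fn.get_text() for fn in footnotes}
    return [by_id[i] for i in ids if i in by_id]


soup = BeautifulSoup(SAMPLE, "html.parser")
coding = soup.find("transactioncoding")

ids = footnote_ids(coding)                               # attribute values, not text
texts = footnote_texts(ids, soup.find_all("footnote"))   # matching footnote bodies
print(ids, texts)
```

The dictionary lookup is one possible alternative to the nested loops above; it returns the same footnote text for each matching id, as long as the footnote ids are unique.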