Question

I am trying to extract values from XML that looks like this:

<transactionCoding>
    <transactionFormType>4</transactionFormType>
    <transactionCode>S</transactionCode>
    <equitySwapInvolved>0</equitySwapInvolved>
    <footnoteId id="F1"/>

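I can get the value of every tag except footnoteId, which returns nothing. I have tried footnoteid.string, gettext, getattr and so on, but nothing works. I need to get the value F1 from the tag, but I can't figure out how.

Here is the code: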

import os
from bs4 import BeautifulSoup


def parse_xml_string(fls):
    temp = fls.find_next("value")
    t = temp.text
    print(t)
    return t


if __name__ == '__main__':
    # soup is the parsed document; the code that builds it is not shown here.
    nonderivtxn = soup.find_all("nonderivativetransaction")

    nd = [[] for _ in range(len(nonderivtxn))]

    for index in range(len(nonderivtxn)):
        coding = nonderivtxn[index].find("transactioncoding")
        tformtype = coding.transactionformtype.text
        tcode = coding.transactioncode.text
        swapinvolved = coding.equityswapinvolved.text
        # This is the line that fails: nothing I try here returns "F1".
        footnote = coding.footnoteid.gettext()
        print(tcode, swapinvolved, footnote.content, tformtype)

Answer 1

If your XML looks like this:

<footnotes>
    <footnote id="F1">See Exhibit 99.1 for text of footnote (1).</footnote>
    <footnote id="F2">See Exhibit 99.1 for text of footnote (2).</footnote>
    <footnote id="F3">See Exhibit 99.1 for text of footnote (3).</footnote>
    <footnote id="F4">See Exhibit 99.1 for text of footnote (4).</footnote>
    <footnote id="F5">See Exhibit 99.1 for text of footnote (5).</footnote>
</footnotes>

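Use the find_all function to load the footnotes into an array. The id is an attribute of the tag rather than its text, so read it with .get("id"). You can then access the values with the following code: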

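def get_footnote_list(ids, footnote_array):
    """Return the inner contents of every footnote whose id is in ids."""
    fl = []

    for wanted_id in ids:
        for footnote in footnote_array:
            # The id is an attribute of the tag, so read it with .get("id").
            if wanted_id == footnote.get("id"):
                fl.append(footnote.decode_contents())

    return fl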
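
To tie this back to the question, here is a minimal sketch of reading the id from the footnoteId tag and then looking up the matching footnote text. The SAMPLE string is a hypothetical document stitched together from the two snippets above, and html.parser is an assumption on my part (it lowercases tag names, which matches the lowercase lookups in the question). The dictionary lookup is just one alternative to the nested loop, not the only way to do it.

```python
from bs4 import BeautifulSoup

# Hypothetical sample combining the snippets from the question and this answer.
SAMPLE = """
<nonDerivativeTransaction>
  <transactionCoding>
    <transactionFormType>4</transactionFormType>
    <transactionCode>S</transactionCode>
    <equitySwapInvolved>0</equitySwapInvolved>
    <footnoteId id="F1"/>
  </transactionCoding>
</nonDerivativeTransaction>
<footnotes>
  <footnote id="F1">See Exhibit 99.1 for text of footnote (1).</footnote>
  <footnote id="F2">See Exhibit 99.1 for text of footnote (2).</footnote>
</footnotes>
"""

# Assumption: html.parser is used, so tag names are lowercased on parsing.
soup = BeautifulSoup(SAMPLE, "html.parser")

coding = soup.find("transactioncoding")

# <footnoteId id="F1"/> has no text content, so .text is an empty string.
# The value lives in the "id" attribute instead.
footnote_tags = coding.find_all("footnoteid")
footnote_ids = [tag.get("id") for tag in footnote_tags]

# Map each footnote id to its text so lookups do not need a nested loop.
footnote_text = {fn.get("id"): fn.get_text() for fn in soup.find_all("footnote")}

# Keep only the ids that actually have a matching <footnote> element.
texts = [footnote_text[i] for i in footnote_ids if i in footnote_text]
print(footnote_ids, texts)
```

In this sketch, coding.footnoteid.text comes back empty because the tag is self-closing, while .get("id") (or coding.footnoteid["id"]) gives the F1 value the question is after.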
