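Problem parsing XML with Beautiful Soup

Time: 2019-02-28 14:10:15

Tags: xml python-3.x beautifulsoup

When trying to replace certain elements in XML with Beautiful Soup, I found that I have to use soup.find_all().string.replace_with() to replace the desired elements. However, I ran into a problem: the soup.find_all() method only returns elements of type None.

So I tried to reduce the problem to the most basic XML possible: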

from bs4 import BeautifulSoup as BS

xml = """
<xml>
    <test tag="0"/>
    <test tag="1"/>
</xml>"""

soup = BS(xml, 'xml')
for elem in soup.find_all("test"):
    print('Element {} has type {}.'.format(elem, elem.type))

This gives exactly the same result:

Element <test tag="0"/> has type None.
Element <test tag="1"/> has type None.

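I would be glad if someone could point out where the problem is.

Thanks in advance

1 Answer:

Answer 0: (score: 0)

I'm not sure what output you are looking for, but you can replace the tag attributes like this: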

from bs4 import BeautifulSoup as BS

xml = """
<xml>
    <test tag="0"/>
    <test tag="1"/>
</xml>"""

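# Attribute values to replace, and the value to replace them with
replace_list = ['0']
replacement = '2'

soup = BS(xml, 'xml')
for elem in soup.find_all("test"):
    # Tag attributes are read and written like dictionary entries
    if elem['tag'] in replace_list:
        elem['tag'] = replacement
    #print('Element {} has type {}.'.format(elem, elem.name))

# Serialize the modified tree back to a string
xml = str(soup)

print(xml)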
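
As for why the question prints None: as far as I can tell, elem.type is not a type check at all. Beautiful Soup treats attribute-style access on a tag as a search for a child element with that name, so elem.type looks for a child <type> element and returns None when there is none. The tag name is elem.name, and the Python class is type(elem). Below is a minimal sketch of that, plus a per-element replace_with() call. The text content "old" and the replacement "new" are my own additions for illustration, not part of the original XML, and the 'xml' parser assumes lxml is installed.

```python
from bs4 import BeautifulSoup
from bs4.element import Tag

# Same structure as in the question; the text "old" is added for illustration only.
xml = """
<xml>
    <test tag="0">old</test>
    <test tag="1"/>
</xml>"""

soup = BeautifulSoup(xml, "xml")  # the "xml" feature needs lxml installed
elems = soup.find_all("test")
first = elems[0]

# Attribute-style access is shorthand for find(): this searches for a child
# element named <type>, finds none, and therefore yields None.
child_lookup = first.type

tag_name = first.name      # name of the element itself
python_type = type(first)  # the Python class of the object

# find_all() returns a list-like ResultSet, so .string has to be read on each
# element. It is None for an empty element, hence the check.
for elem in elems:
    if elem.string is not None:
        elem.string.replace_with("new")

texts = [elem.string for elem in elems]
print(child_lookup, tag_name, python_type.__name__, texts)
```

Calling .string directly on the result of find_all() does not work because the result is a collection, not a single tag, so the loop (or find() for a single match) is needed.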