Question

I have text files in a directory, such as "Rest.txt" and "Test.txt". A sample string from one of the txt files is shown below.

Input: <Hello> <Message><stdout>"weblogic.servelt.Default(self-tuning)"] <12-18-2020> <?xml version="1.0" encoding="UTF-8"?><breakfast_menu><food><name>Belgian Waffles</name><price>$5.95</price> <description>Two of our famous Belgian Waffles with plenty of real maple syrup</description><calories>650</calories></food><food><name>Strawberry Belgian Waffles</name> <price>$7.95</price><hello><sjiadasjhds>954jkldfksfkfjsdfklsdjf

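There is XML markup embedded in the string above. I need to extract all of the XML and save it in another directory using Python. In the same way, I need to extract and save the XML data from all of the text files.

Output: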

<breakfast_menu>
<food>
<name>Belgian Waffles</name>
<price>$5.95</price> 
<description>Two of our famous Belgian Waffles with plenty of real maple syrup</description><calories>650</calories>
</food>
</breakfast_menu>

Answer 1

You can use this code to extract the tags:

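from bs4 import BeautifulSoup

# Read the whole text file; the "xml" parser requires lxml to be installed
with open("test.txt", "r") as f:
    a = f.read()
soup = BeautifulSoup(a, "xml")
# First <breakfast_menu> element, or None if it was not found
result = soup.find("breakfast_menu")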

`result` now holds the element you need; you can write it to whatever file you want.
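The question also asks to do this for every text file in a directory and save the results elsewhere. Here is a minimal sketch of that loop using only the standard library. It is not the BeautifulSoup approach: it swaps in a regular expression, and it assumes the root element is named `breakfast_menu` and is properly closed in each file. The names `extract_xml` and `extract_all` and the `.xml` output suffix are hypothetical, chosen for illustration. For truncated input like the sample in the question, where the closing tag is missing, a recovering parser would probably be needed instead.

```python
import re
from pathlib import Path

# Assumption: the root element is <breakfast_menu> and it is closed in the file.
XML_BLOCK = re.compile(r"<breakfast_menu>.*?</breakfast_menu>", re.DOTALL)


def extract_xml(text):
    """Return the first complete <breakfast_menu> element in text, or None."""
    match = XML_BLOCK.search(text)
    return match.group(0) if match else None


def extract_all(src_dir, dst_dir):
    """Save the XML found in each .txt file of src_dir as a .xml file in dst_dir."""
    src, dst = Path(src_dir), Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    written = []
    for txt_path in sorted(src.glob("*.txt")):
        xml = extract_xml(txt_path.read_text(encoding="utf-8"))
        if xml is None:
            continue  # no complete XML block in this file, skip it
        out_path = dst / (txt_path.stem + ".xml")
        out_path.write_text(xml, encoding="utf-8")
        written.append(out_path.name)
    return written
```

Calling `extract_all("input_dir", "output_dir")` (both directory names are placeholders) returns the names of the files it wrote, so you can check which inputs contained no complete XML block.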
