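I have written a Python script that parses XML (in the format shown below) with xml.dom.minidom. It then sends an email alert to the email IDs defined in the XML file, together with other data defined in the XML such as the subject, pages, and so on. When the subject contains characters like '&#@%*', I get the error "xml.parsers.expat.ExpatError: not well-formed (invalid token): line 14, column 36". Please suggest how to resolve this.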
from xml.dom.minidom import parse, parseString
import os
import glob
path = r'C:\Users\sachin\Desktop\xmlwatcher'
for xml in glob.glob(os.path.join(path, '*.xml')):
    xmldoc = parse(xml)
    Subject = xmldoc.getElementsByTagName('FromName')[0].firstChild.data
    print(Subject)
Sample script
<service
android:name=".MyFirebaseMessagingService">
<intent-filter>
<action android:name="com.google.firebase.MESSAGING_EVENT"/>
</intent-filter>
</service>
<service
android:name=".MyFirebaseInstanceIDService">
<intent-filter>
<action android:name="com.google.firebase.INSTANCE_ID_EVENT"/>
</intent-filter>
</service>
Answer 0 (score: 0)
Unfortunately, xml.dom.minidom is right. Correct XML text must not contain a raw & character. In XML, & is used to introduce an entity reference, so a literal ampersand has to be written as &amp;.
Therefore, any strict XML parser should choke on that line, because it is illegal.
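To illustrate the point, here is a minimal sketch that uses only the standard library. The two sample strings and the helper name try_parse are made up for illustration; they are not taken from the actual file in the question.

```python
from xml.dom.minidom import parseString
from xml.parsers.expat import ExpatError

# Hypothetical sample data: the same element with a raw and an escaped ampersand.
raw = "<Fax><FromName>Test Email & Transaction</FromName></Fax>"
escaped = "<Fax><FromName>Test Email &amp; Transaction</FromName></Fax>"


def try_parse(text):
    """Return the FromName text, or None if the document is not well-formed."""
    try:
        doc = parseString(text)
    except ExpatError:
        return None
    return doc.getElementsByTagName("FromName")[0].firstChild.data


print(try_parse(raw))      # the strict parser rejects the raw ampersand
print(try_parse(escaped))  # the entity reference is decoded back to "&"
```

The escaped form costs nothing on the reading side: the parser hands back a plain "&" in the text node, so the email subject would look exactly as intended.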
What can be done?
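The best approach is to fix the bug in the producer and process a correct XML file. If that is not possible, you can try to repair the file yourself by replacing every bare & (one that does not already start an entity such as &amp;) with &amp;.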
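The manual repair could look roughly like the sketch below. It is only an assumption about how one might do it: the regular expression, the helper name escape_bare_ampersands, and the sample string are illustrative and not part of the original script.

```python
import re
from xml.dom.minidom import parseString

# Matches "&" unless it already starts an entity or character reference
# such as &amp;, &lt;, &#38; or &#x26;.
BARE_AMPERSAND = re.compile(
    r"&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#x[0-9A-Fa-f]+);)"
)


def escape_bare_ampersands(text):
    """Replace every bare "&" with "&amp;", leaving existing references alone."""
    return BARE_AMPERSAND.sub("&amp;", text)


# Hypothetical input resembling the failing subject line.
broken = "<Fax><FromName>A & B &amp; C &#@%*</FromName></Fax>"
fixed = escape_bare_ampersands(broken)

doc = parseString(fixed)
print(doc.getElementsByTagName("FromName")[0].firstChild.data)
```

In the watcher script, one would read each file as text, pass it through the helper, and call parseString on the result instead of parse on the path. This is a text-level patch, so it would also rewrite ampersands inside CDATA sections or comments, and it cannot repair other problems such as a raw < in the data.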
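A simpler and possibly more robust way is to use BeautifulSoup. It is well suited to parsing incorrect input and can automatically find the best interpretation of a faulty input file. The code below repairs the offending & and displays the corrected document. Note that the 'html.parser' builder lowercases tag names, so <FromName> comes out as <fromname>: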
t = """<?xml version="1.0" encoding="utf-8" ?>
<Fax>
...
<FromName>Test Email & Transaction from Test Branch</FromName>
...
</Fax>"""
import bs4
soup = bs4.BeautifulSoup(t, 'html.parser')
print(soup.prettify())