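I have an XML file and an XML schema. I want to validate the file against that schema and check whether it conforms. I am using Python, but I am open to any language if Python has no suitable library for this.
What is my best option here? My main concern is how quickly I can get this up and running.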
Answer 0 (score: 25)
Definitely lxml.
Define an XMLParser with a predefined schema, load the file with fromstring(), and catch any XML schema errors:
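from lxml import etree

def validate(xmlparser, xmlfilename):
    """Return True if the file parses and validates against the parser's schema."""
    try:
        with open(xmlfilename, 'r') as f:
            etree.fromstring(f.read(), xmlparser)
        return True
    # A schema-validating parser reports invalid documents as XMLSyntaxError,
    # so catch that alongside XMLSchemaError.
    except (etree.XMLSchemaError, etree.XMLSyntaxError):
        return False

# Compile the schema once and reuse the same parser for every file.
schema_file = 'schema.xsd'
with open(schema_file, 'r') as f:
    schema_root = etree.XML(f.read())
schema = etree.XMLSchema(schema_root)
xmlparser = etree.XMLParser(schema=schema)

filenames = ['input1.xml', 'input2.xml', 'input3.xml']
for filename in filenames:
    if validate(xmlparser, filename):
        print("%s validates" % filename)
    else:
        print("%s doesn't validate" % filename)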
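The validate() function above only reports pass or fail. As a minimal sketch of how the reasons for a failure might be surfaced, the snippet below calls XMLSchema.validate() on an already parsed document and reads the schema's error_log. The in-memory schema, the sample documents, and the helper name validation_errors are made up for illustration and are not part of the original answer.

```python
from lxml import etree

# Hypothetical one-element schema, used only for illustration.
XSD = b"""<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="a" type="xs:integer"/>
</xs:schema>"""


def validation_errors(schema, xml_bytes):
    """Return a list of readable messages; an empty list means the document is valid."""
    try:
        # Parse without a schema first, so well-formedness problems
        # are reported separately from schema violations.
        doc = etree.fromstring(xml_bytes)
    except etree.XMLSyntaxError as exc:
        return ["not well-formed: %s" % exc]
    if schema.validate(doc):
        return []
    # error_log holds the entries produced by the most recent validation.
    return ["line %d: %s" % (entry.line, entry.message) for entry in schema.error_log]


schema = etree.XMLSchema(etree.fromstring(XSD))
for sample in (b"<a>5</a>", b"<a>not a number</a>", b"<a>"):
    print(sample, validation_errors(schema, sample))
```

Because the compiled schema object is reused for every document, the cost of parsing the XSD is paid once, which matters when many files are checked in a loop.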
If the schema file contains an XML declaration with an encoding (for example <?xml version="1.0" encoding="UTF-8"?>), the code above raises the following error:
Traceback (most recent call last):
  File "<input>", line 2, in <module>
    schema_root = etree.XML(f.read())
  File "src/lxml/etree.pyx", line 3192, in lxml.etree.XML
  File "src/lxml/parser.pxi", line 1872, in lxml.etree._parseMemoryDocument
ValueError: Unicode strings with encoding declaration are not supported. Please use bytes input or XML fragments without declaration.
One solution is to open the files in binary mode: open(..., 'rb')
[...]
def validate(xmlparser, xmlfilename):
    try:
        with open(xmlfilename, 'rb') as f:
[...]
with open(schema_file, 'rb') as f:
[...]
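A small sketch of the difference, assuming a made-up one-element schema: a str that carries an encoding declaration is rejected, while the same content as bytes is accepted. Letting etree.parse() read the file from its path is another way to avoid decoding the file yourself. The helper names and the temporary file below are illustrative only.

```python
import os
import tempfile

from lxml import etree

# Hypothetical schema text with an encoding declaration (illustration only).
XSD_TEXT = (
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    '<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">\n'
    '  <xs:element name="a" type="xs:integer"/>\n'
    '</xs:schema>\n'
)


def load_schema_from_text(text):
    """Encode str input to bytes before parsing (assumes the text is UTF-8)."""
    data = text.encode("utf-8") if isinstance(text, str) else text
    return etree.XMLSchema(etree.XML(data))


def load_schema_from_path(path):
    """Let lxml open the file itself and honour its declared encoding."""
    return etree.XMLSchema(etree.parse(path))


# Passing the str directly is the situation that produces the ValueError.
try:
    etree.XML(XSD_TEXT)
    str_rejected = False
except ValueError:
    str_rejected = True
print("str input rejected:", str_rejected)

schema_a = load_schema_from_text(XSD_TEXT)

# Write the schema to a temporary file to exercise the path-based loader.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "schema.xsd")
    with open(path, "wb") as f:
        f.write(XSD_TEXT.encode("utf-8"))
    schema_b = load_schema_from_path(path)

doc = etree.fromstring(b"<a>5</a>")
print(schema_a.validate(doc), schema_b.validate(doc))
```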
Answer 1 (score: 2)
The Python snippet is fine, but an alternative is to use xmllint:
xmllint --schema sample.xsd --noout sample.xml
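If the xmllint route needs to be driven from a Python script, one possible sketch is to shell out with subprocess and treat a zero exit status as success. The function names here are hypothetical, and the code assumes the xmllint binary is installed and on PATH.

```python
import shutil
import subprocess


def xmllint_command(xsd_path, xml_path):
    """Build the argument list for the xmllint invocation shown above."""
    return ["xmllint", "--schema", xsd_path, "--noout", xml_path]


def validate_with_xmllint(xsd_path, xml_path):
    """Return (is_valid, diagnostics); raise if xmllint is not installed."""
    if shutil.which("xmllint") is None:
        raise RuntimeError("xmllint not found on PATH")
    result = subprocess.run(
        xmllint_command(xsd_path, xml_path),
        capture_output=True,
        text=True,
    )
    # xmllint exits with 0 when the document validates and non-zero otherwise;
    # its diagnostics are written to stderr.
    return result.returncode == 0, result.stderr.strip()


if __name__ == "__main__":
    print(xmllint_command("sample.xsd", "sample.xml"))
```

Passing the arguments as a list rather than a single shell string avoids quoting problems with file names that contain spaces.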