xmllint fails to validate an XHTML 1.0 Transitional file

Time: 2016-05-15 18:54:14

Tags: linux validation xhtml xmllint xhtml-transitional

Steps to reproduce on Debian Jessie GNU/Linux.

Check the xmllint version:

$ xmllint --version
xmllint: using libxml version 20901
   compiled with: Threads Tree Output Push Reader Patterns Writer SAXv1 FTP HTTP DTDValid HTML Legacy C14N Catalog XPath XPointer XInclude Iconv ISO8859X Unicode Regexps Automata Expr Schemas Schematron Modules Debug Zlib Lzma 

Create an XHTML 1.0 Transitional file by saving the following as example.xhtml:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>
<title>A title</title>
</head>

<body>
Some content
</body>

</html>

N.B. Pasting the contents of example.xhtml into the W3C Validator yields "This document was successfully checked as XHTML 1.0 Transitional!", so it should also validate with xmllint.

Online validation with xmllint

This fails even though the computer has Internet access:

$ xmllint --noout --valid example.xhtml
example.xhtml:1: warning: failed to load external entity "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"
 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"
                                                                               ^
example.xhtml:2: validity error : Validation failed: no DTD found !
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
                                                                  ^

Offline validation with xmllint

Install the XHTML 1.0 DTDs and entity files:

$ wget -qO- https://www.w3.org/TR/xhtml1/xhtml1.tgz | tar xvz
xhtml1-20020801/
xhtml1-20020801/W3C-REC.css
xhtml1-20020801/xhtml.css
xhtml1-20020801/logo-REC.png
xhtml1-20020801/w3c_home.png
xhtml1-20020801/wcag1AAA.png
xhtml1-20020801/acks.html
xhtml1-20020801/Cover.html
xhtml1-20020801/definitions.html
xhtml1-20020801/diffs.html
xhtml1-20020801/dtds.html
xhtml1-20020801/guidelines.html
xhtml1-20020801/introduction.html
xhtml1-20020801/issues.html
xhtml1-20020801/normative.html
xhtml1-20020801/Overview.html
xhtml1-20020801/prohibitions.html
xhtml1-20020801/references.html
xhtml1-20020801/xhtml1-diff.html
xhtml1-20020801/DTD/
xhtml1-20020801/DTD/xhtml-lat1.ent
xhtml1-20020801/DTD/xhtml-special.ent
xhtml1-20020801/DTD/xhtml-symbol.ent
xhtml1-20020801/DTD/xhtml.soc
xhtml1-20020801/DTD/xhtml1-frameset.dtd
xhtml1-20020801/DTD/xhtml1-strict.dtd
xhtml1-20020801/DTD/xhtml1-transitional.dtd
xhtml1-20020801/DTD/xhtml1.dcl
xhtml1-20020801/xhtml1.ps
xhtml1-20020801/xhtml1.pdf

This still fails:

$ xmllint --noout --dtdvalid xhtml1-20020801/DTD/xhtml1-transitional.dtd example.xhtml 
example.xhtml:1: warning: failed to load external entity "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"
 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"
                                                                               ^

The same happens with the --nonet option:

$ xmllint --noout --nonet --dtdvalid xhtml1-20020801/DTD/xhtml1-transitional.dtd example.xhtml 
I/O error : Attempt to load network entity http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd
example.xhtml:1: warning: failed to load external entity "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"
 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"
                                                                               ^
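
Both --dtdvalid runs above print a warning about the system identifier in the DOCTYPE, but neither prints a validity error. A minimal sketch for checking whether the validation against the local DTD itself failed is to inspect the exit status, which xmllint sets to zero when no error occurred. It assumes example.xhtml and the unpacked xhtml1-20020801 directory are in the current directory; the variable names are illustrative only.

```shell
#!/bin/sh
# Sketch: tell a warning apart from a validation failure via the exit status.
# Assumes the files from the steps above sit in the current directory.
dtd=xhtml1-20020801/DTD/xhtml1-transitional.dtd
doc=example.xhtml

if command -v xmllint >/dev/null 2>&1 && [ -f "$dtd" ] && [ -f "$doc" ]; then
    xmllint --noout --nonet --dtdvalid "$dtd" "$doc"
    status=$?
    if [ "$status" -eq 0 ]; then
        echo "exit status 0: messages above were warnings only"
    else
        echo "xmllint failed with exit status $status"
    fi
else
    # Nothing to check on this machine; record that the run was skipped.
    status=skipped
    echo "skipped: xmllint, $dtd or $doc not found"
fi
```

The xmllint man page lists what the individual non-zero exit codes mean.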

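Questions

I have two questions:

  1. Why did none of these validation attempts succeed?
  2. The second attempt appears to fail because, despite the --dtdvalid option, xmllint still tries to access http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd, since it is referenced in example.xhtml. Is there a way to tell xmllint to ignore that reference and use a local DTD instead (for example, the one already stored at xhtml1-20020801/DTD/xhtml1-transitional.dtd)?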

1 answer:

Answer 0 (score: 0)

The simplest workaround appears to be:

$ sudo apt-get install w3c-dtd-xhtml

This installs the relevant DTDs locally. After that, validation succeeds:

$ xmllint --noout --valid example.xhtml
$

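However, although this lets me validate XHTML files, it does not really answer the questions. I will therefore not mark this question as answered, in the hope that someone will provide an answer that does.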
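
One mechanism that may address the second question is an XML catalog, which libxml2 consults to map public or system identifiers to local files. The following is a minimal sketch, not a confirmed fix for this setup. It assumes the archive was unpacked into the current directory as in the question and that the path contains no spaces or other characters that need URI escaping. The catalog file name catalog.xml is hypothetical.

```shell
#!/bin/sh
# Sketch: resolve the XHTML 1.0 DTD URLs to local files via an XML catalog.
# Assumes xhtml1-20020801/ was unpacked in the current directory;
# catalog.xml is a hypothetical file name chosen for this example.
dtd_dir="$PWD/xhtml1-20020801/DTD"

cat > catalog.xml <<EOF
<?xml version="1.0"?>
<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
  <!-- Rewrite system identifiers under the W3C DTD directory to the local copy -->
  <rewriteSystem systemIdStartString="http://www.w3.org/TR/xhtml1/DTD/"
                 rewritePrefix="file://$dtd_dir/"/>
</catalog>
EOF

# Run the validation only if xmllint and the sample file are present.
if command -v xmllint >/dev/null 2>&1 && [ -f example.xhtml ]; then
    # XML_CATALOG_FILES points libxml2 at the catalog; --nonet forbids network access.
    XML_CATALOG_FILES="$PWD/catalog.xml" \
        xmllint --noout --nonet --valid example.xhtml \
        && echo "example.xhtml validates"
fi
```

XML_CATALOG_FILES is the environment variable libxml2 reads for catalog files. The w3c-dtd-xhtml package presumably arranges something similar system-wide, though that has not been verified here.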