"&" URL的PATH段中允许使用的符号,还是应该转义?
根据nu w3c验证器(https://validator.w3.org/nu/) 我得到了:
Error: & did not start a character reference. (& probably should have been escaped as &.)
At line 407, column 52
<a href="/Bags-&-Purses/c/wome
However, if I encode the URL with the Java URI class, spaces and similar characters get encoded, but the & symbol does not:
URI u = new URI(request.getScheme(), null,
        request.getServerName(), request.getServerPort(),
        request.getContextPath() + url,
        query, null);
u.toURL().toString();
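where the url string is: /Bags-&-Purses/c/womens-accessories-bags
The result is: https://localhost:8112/storefront/Bags-&-Purses/c/womens-accessories-bags (not encoded)
The question is why the & is not escaped. Is this valid? I would have guessed it should be escaped as %26, but it does not appear to be escaped at all.
Answer 0 (score: 1)
&, although a reserved character, appears to be a valid character in a path segment of a URI. If you look at the grammar for path segments in RFC 3986, section 3.3, & is allowed as part of the sub-delims group: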
    path          = path-abempty    ; begins with "/" or is empty
                  / path-absolute   ; begins with "/" but not "//"
                  / path-noscheme   ; begins with a non-colon segment
                  / path-rootless   ; begins with a segment
                  / path-empty      ; zero characters

    path-abempty  = *( "/" segment )
    path-absolute = "/" [ segment-nz *( "/" segment ) ]
    path-noscheme = segment-nz-nc *( "/" segment )
    path-rootless = segment-nz *( "/" segment )
    path-empty    = 0<pchar>

    segment       = *pchar
    segment-nz    = 1*pchar
    segment-nz-nc = 1*( unreserved / pct-encoded / sub-delims / "@" )
                  ; non-zero-length segment without any colon ":"

    pchar         = unreserved / pct-encoded / sub-delims / ":" / "@"
    (...)
    reserved      = gen-delims / sub-delims
    gen-delims    = ":" / "/" / "?" / "#" / "[" / "]" / "@"
    sub-delims    = "!" / "$" / "&" / "'" / "(" / ")"
                  / "*" / "+" / "," / ";" / "="
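Since you are asking about URLs rather than the more general URIs: as far as I know, URLs place no additional restrictions on path segments. Section 2.2 of the same RFC goes on to say that reserved characters should be percent-encoded unless they are specifically allowed in that component. In this case, though, per the grammar above, all characters in the sub-delims group (& included) appear to be specifically allowed in path segments.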
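To see this behaviour in isolation, here is a minimal sketch using only java.net.URI. The class name UriPathDemo, the hard-coded scheme, host and port, and the sample path containing a space are assumptions made for illustration; they are not taken from your application. It shows that the multi-argument URI constructor percent-encodes a space in the path but leaves & alone, and that a path written with %26 decodes to the same character.

```java
import java.net.URI;
import java.net.URISyntaxException;

class UriPathDemo {
    // Builds a URI from components, similar to the snippet in the question,
    // and returns its string form. Scheme, host and port are placeholder values.
    static String build(String path) {
        try {
            URI u = new URI("https", null, "localhost", 8112, path, null, null);
            return u.toString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Parses an already-encoded URI string and returns the decoded path.
    static String decodedPath(String uri) {
        return URI.create(uri).getPath();
    }

    public static void main(String[] args) {
        // The space is quoted as %20, the & is left as it is.
        System.out.println(build("/storefront/Bags-&-Purses/c/womens accessories"));
        // %26 in the path decodes back to &.
        System.out.println(decodedPath("https://localhost:8112/Bags-%26-Purses"));
    }
}
```

So both /Bags-&-Purses and /Bags-%26-Purses are syntactically valid paths; the URI class simply has no reason to rewrite the first form.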
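However, the problem you are running into has nothing to do with the URL itself. It is about how that URL is represented as text inside an HTML document. The & character cannot appear on its own in HTML and must always be encoded, as &amp;. Related question: Do I really need to encode '&' as '&amp;'?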
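As a rough illustration of that distinction, the sketch below keeps the URL unchanged and escapes it only at the point where it is written into the HTML attribute. The class HrefEscapeDemo, the helper names and the link text are hypothetical, and the escaping is deliberately minimal; in a real application you would more likely rely on the escaping facility of your template engine or a well-tested library.

```java
class HrefEscapeDemo {
    // Minimal HTML escaping for attribute values and text.
    // & is replaced first so that the entities added afterwards are not escaped twice.
    static String escapeHtml(String value) {
        return value.replace("&", "&amp;")
                    .replace("\"", "&quot;")
                    .replace("<", "&lt;")
                    .replace(">", "&gt;");
    }

    // Renders an anchor element; href and text are escaped for HTML, not for the URL.
    static String anchor(String href, String text) {
        return "<a href=\"" + escapeHtml(href) + "\">" + escapeHtml(text) + "</a>";
    }

    public static void main(String[] args) {
        // The link text "Bags & Purses" is an assumed example value.
        System.out.println(anchor("/Bags-&-Purses/c/womens-accessories-bags", "Bags & Purses"));
    }
}
```

The browser decodes &amp; back to & when it parses the attribute, so the request still goes to /Bags-&-Purses/c/womens-accessories-bags.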