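If I put the URL http://localhost:9000/space test into the address bar of a web browser, the server is called with http://localhost:9000/space%20test.
Likewise, http://localhost:9000/specÁÉÍtest is encoded as http://localhost:9000/spec%C3%81%C3%89%C3%8Dtest.
If I put the already-encoded URLs into the address bar (that is, http://localhost:9000/space%20test and http://localhost:9000/spec%C3%81%C3%89%C3%8Dtest), they stay unchanged (they are not double-encoded).
Is there a Java API or library that performs this encoding? The URLs come from the user, so I don't know whether they are already encoded.
(If there isn't, would it be enough to search the input string for % and encode only when none is found, or are there special cases where that does not work?)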
Edit:
URLEncoder.encode("space%20test", "UTF-8") returns space%2520test, which is not what I want because it is double-encoded.
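The behaviour described in this edit is easy to reproduce. Here is a minimal sketch; the class name UrlEncoderDemo and the helper formEncode are made up for illustration. URLEncoder does HTML form encoding: it escapes every % it sees and turns a space into +, so it cannot tell an existing escape from a literal percent sign.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

class UrlEncoderDemo {
    // Hypothetical helper: wraps URLEncoder.encode so callers need no checked exception.
    static String formEncode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    public static void main(String[] args) {
        System.out.println(formEncode("space test"));             // space+test
        System.out.println(formEncode("space%20test"));           // space%2520test
        System.out.println(formEncode(formEncode("space test"))); // space%2Btest
    }
}
```

Each extra call changes the result again, so the operation is not idempotent, which is the property the browser behaviour has.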
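Edit 2:
In addition, browsers handle partially encoded URLs, such as http://localhost:9000/specÁÉ%C3%8Dtest, without double-encoding them. In that case the server receives the following URL: http://localhost:9000/spec%C3%81%C3%89%C3%8Dtest. That is the same as the encoded form of ...specÁÉÍtest.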
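The behaviour asked for here (encode what needs encoding, keep what is already escaped) can be sketched roughly as follows. This is only an illustration under stated assumptions: the class name LenientPercentEncoder is invented, the SAFE set is simply the RFC 3986 unreserved and reserved characters rather than any browser's real table, and bytes are taken from UTF-8.

```java
import java.nio.charset.StandardCharsets;

class LenientPercentEncoder {
    // Assumption for this sketch: RFC 3986 unreserved + reserved characters pass through.
    private static final String SAFE =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789"
        + "-._~:/?#[]@!$&'()*+,;=";

    static String encode(String input) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (c == '%' && isEscapeAt(input, i)) {
                out.append(input, i, i + 3);   // keep an existing %XX escape as-is
                i += 3;
            } else if (SAFE.indexOf(c) >= 0) {
                out.append(c);
                i++;
            } else {
                // Everything else (space, stray '%', non-ASCII) is escaped byte by byte.
                int cp = input.codePointAt(i);
                byte[] bytes = new String(Character.toChars(cp)).getBytes(StandardCharsets.UTF_8);
                for (byte b : bytes) {
                    out.append(String.format("%%%02X", b & 0xFF));
                }
                i += Character.charCount(cp);
            }
        }
        return out.toString();
    }

    private static boolean isEscapeAt(String s, int i) {
        return i + 2 < s.length() && isHex(s.charAt(i + 1)) && isHex(s.charAt(i + 2));
    }

    private static boolean isHex(char c) {
        return (c >= '0' && c <= '9') || (c >= 'a' && c <= 'f') || (c >= 'A' && c <= 'F');
    }
}
```

With this sketch, encode("http://localhost:9000/space test") gives http://localhost:9000/space%20test, and feeding the result back in leaves it unchanged. The price is an unavoidable ambiguity: a user who really meant the literal text %20 cannot be distinguished from one who typed an escape.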
Answer 0 (score: 10)
What every web developer must know about URL encoding
Why do I need URL encoding?
The URL specification RFC 1738 specifies that only a small set of characters
can be used in a URL. Those characters are:
A to Z (ABCDEFGHIJKLMNOPQRSTUVWXYZ)
a to z (abcdefghijklmnopqrstuvwxyz)
0 to 9 (0123456789)
$ (Dollar Sign)
- (Hyphen / Dash)
_ (Underscore)
. (Period)
+ (Plus sign)
! (Exclamation / Bang)
* (Asterisk / Star)
' (Single Quote)
( (Open Bracket)
) (Closing Bracket)
, (Comma)
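How does URL encoding work?
All offending characters are replaced by a % and a two-digit hexadecimal value
that represents the character in the proper ISO character set (browsers today
use the bytes of the character's UTF-8 encoding instead, as the %C3%81 example
in the question shows). Here are a couple of examples: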
$ (Dollar Sign) becomes %24
& (Ampersand) becomes %26
+ (Plus) becomes %2B
, (Comma) becomes %2C
: (Colon) becomes %3A
; (Semi-Colon) becomes %3B
= (Equals) becomes %3D
? (Question Mark) becomes %3F
@ (Commercial A / At) becomes %40
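The substitution rule can be sketched in a few lines. The class name PercentEscape and the method escapeAll are invented for this illustration, and the choice of UTF-8 is an assumption taken from the browser output in the question.

```java
import java.nio.charset.StandardCharsets;

class PercentEscape {
    private static final char[] HEX = "0123456789ABCDEF".toCharArray();

    // Escapes every byte of the text's UTF-8 encoding as %XX (no character is spared).
    static String escapeAll(String text) {
        StringBuilder out = new StringBuilder();
        for (byte b : text.getBytes(StandardCharsets.UTF_8)) {
            out.append('%').append(HEX[(b >> 4) & 0xF]).append(HEX[b & 0xF]);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"$", "&", "+", ",", ":", ";", "=", "?", "@"}) {
            System.out.println(s + " becomes " + escapeAll(s));
        }
    }
}
```

A character outside ASCII occupies several UTF-8 bytes and therefore yields several escapes, for example %C3%81 for Á.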
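Simple example:
import java.util.logging.Level;
import java.util.logging.Logger;
import javax.script.ScriptEngine;
import javax.script.ScriptEngineManager;
import javax.script.ScriptException;

public class TextHelper {

    // May be null on JDK 15 and later, where the bundled Nashorn engine was removed.
    private static final ScriptEngine engine = new ScriptEngineManager()
            .getEngineByName("JavaScript");

    /**
     * Escapes a string with JavaScript's escape(), then restores the
     * characters %$&+,/:;=?@<># so that URL delimiters and existing
     * escapes are left alone.
     *
     * @param str the string to encode
     * @return the encoded result, or null if the script fails
     */
    public static String escapeJavascript(String str) {
        try {
            // Pass the input as a binding instead of splicing it into the
            // script, so quotes or backslashes in str cannot break it.
            engine.put("input", str.replace("%20", " "));
            // String.replace is literal; replaceAll would throw on "$",
            // which is a group-reference character in a replacement.
            return engine.eval("escape(input)").toString()
                    .replace("%3A", ":")
                    .replace("%2F", "/")
                    .replace("%3B", ";")
                    .replace("%40", "@")
                    .replace("%3C", "<")
                    .replace("%3E", ">")
                    .replace("%3D", "=")
                    .replace("%26", "&")
                    .replace("%24", "$")
                    .replace("%23", "#")
                    .replace("%2B", "+")
                    .replace("%2C", ",")
                    .replace("%3F", "?")
                    // Restore "%" last so escapes supplied by the caller
                    // are not decoded by the replacements above.
                    .replace("%25", "%");
        } catch (ScriptException ex) {
            Logger.getLogger(TextHelper.class.getName())
                    .log(Level.SEVERE, null, ex);
            return null;
        }
    }
}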
Answer 1 (score: 8)
Use java.net.URLEncoder#encode():
String page = "space test";
// URLEncoder does form encoding, so the space becomes "+" rather than "%20".
String encodedURL = "http://localhost:9000/" + URLEncoder.encode(page, "UTF-8");
Note: encoding the full URL leads to an undesired result, for example http:// is encoded as http%3A%2F%2F!
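One way to keep the scheme and separators out of the encoder is to hand the components to the multi-argument java.net.URI constructor, which quotes illegal characters per component. A minimal sketch; the class name UriPathDemo and the helper build are invented for illustration.

```java
import java.net.URI;
import java.net.URISyntaxException;

class UriPathDemo {
    // Builds a URL from its parts; only the path is supplied unencoded here.
    static String build(String scheme, String authority, String path) {
        try {
            // toASCIIString() additionally escapes non-ASCII characters as UTF-8.
            return new URI(scheme, authority, path, null, null).toASCIIString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // http://localhost:9000/space%20test
        System.out.println(build("http", "localhost:9000", "/space test"));
        // http://localhost:9000/space%2520test  (a '%' is always quoted)
        System.out.println(build("http", "localhost:9000", "/space%20test"));
    }
}
```

Note that these constructors always quote a % character, so input that is already encoded still ends up double-encoded; this only solves the http%3A%2F%2F problem, not the question's main one.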
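Edit: to prevent encoding a URL twice, you could check whether the URL contains a %, since that character is only valid as part of an escape. But if the user got the encoding wrong (for example, by encoding the URL only partially, or by using a % in the URL that does not introduce an escape), there is not much this approach can do.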
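The limits of the "does it contain a %" check can be made concrete. In this sketch the class PercentHeuristic and its three methods are hypothetical names chosen for illustration.

```java
import java.util.regex.Pattern;

class PercentHeuristic {
    // A '%' that is not followed by two hex digits cannot be an escape.
    private static final Pattern STRAY_PERCENT = Pattern.compile("%(?![0-9A-Fa-f]{2})");

    // The heuristic under discussion: any '%' is taken as a sign of prior encoding.
    static boolean looksEncoded(String url) {
        return url.indexOf('%') >= 0;
    }

    static boolean hasStrayPercent(String url) {
        return STRAY_PERCENT.matcher(url).find();
    }

    // True if the URL still contains a space, control or non-ASCII character.
    static boolean hasRawCharacters(String url) {
        return url.chars().anyMatch(c -> c <= 32 || c >= 127);
    }
}
```

For the partially encoded URL from the question (specÁÉ%C3%8Dtest), looksEncoded returns true even though hasRawCharacters is also true, so the heuristic would skip a URL that still needs work. For a URL ending in 100%, hasStrayPercent reports the malformed percent sign.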
Answer 2 (score: 3)
In the end I checked what Firefox and Chrome do. I used the following URL in both browsers and captured the HTTP request with netcat (nc -l -p 9000):
http://localhost:9000/!"$%&'()*+,-./:;<=>?@[\]^_`{|}~
This URL contains every character from ASCII 32 to 127 except [0-9A-Za-z#].
The request captured with Firefox 18.0.1 is as follows:
GET /!%22$%&%27()*+,-./:;%3C=%3E?@[\]^_%60{|}~%7F HTTP/1.1
With Chrome:
GET /!%22$%&'()*+,-./:;%3C=%3E?@[\]^_`{|}~%7F HTTP/1.1
Firefox encodes more characters than Chrome. Here it is as a table:
Char | Hex | Dec | Encoded by
-----------------------------------------
"    | %22 |  34 | Firefox, Chrome
'    | %27 |  39 | Firefox
<    | %3C |  60 | Firefox, Chrome
>    | %3E |  62 | Firefox, Chrome
`    | %60 |  96 | Firefox
DEL  | %7F | 127 | Firefox, Chrome
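I found some similar code in the Firefox source tree (toolkit/components/url-classifier/nsUrlClassifierUtils.cpp), but I am not sure whether it is the algorithm that is actually used.
Anyway, here is proof-of-concept code in Java: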
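import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Place these methods inside a class of your choice.
// Does not handle "#". A "%" is never encoded, so existing %XX escapes
// pass through unchanged.
public static String encode(final String input) {
    final StringBuilder result = new StringBuilder();
    // Walk code points rather than chars so that a surrogate pair is
    // encoded as one character instead of two "?".
    for (int i = 0; i < input.length(); ) {
        final int cp = input.codePointAt(i);
        if (shouldEncode(cp)) {
            result.append(encodeCodePoint(cp));
        } else {
            result.appendCodePoint(cp);
        }
        i += Character.charCount(cp);
    }
    return result.toString();
}

private static String encodeCodePoint(final int cp) {
    if (cp == ' ') {
        return "%20"; // URLEncoder.encode returns "+"
    }
    try {
        return URLEncoder.encode(new String(Character.toChars(cp)), "UTF-8");
    } catch (final UnsupportedEncodingException e) {
        throw new IllegalStateException(e);
    }
}

private static boolean shouldEncode(final int cp) {
    // Control characters, space, DEL and everything outside ASCII
    if (cp <= 32 || cp >= 127) {
        return true;
    }
    // The printable characters that both browsers encoded
    return cp == '"' || cp == '<' || cp == '>';
}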
Since it uses URLEncoder.encode, it handles characters such as ÁÉÍ as well as ASCII characters.
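Answer 3 (score: 2)
Here is a Scala snippet. This encoder encodes the non-ASCII characters and reserved characters in a URL. Because the operation is idempotent, the URL is never double-encoded.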
import java.net.URL
import scala.util.parsing.combinator.RegexParsers

object IdempotentURLEncoder extends RegexParsers {
  override def skipWhitespace = false

  private def segment = rep(char)
  // Allowed characters and existing %XX escapes pass through; anything else is encoded.
  private def char = unreserved | escape | any ^^ { java.net.URLEncoder.encode(_, "UTF-8") }
  private def unreserved = """[A-Za-z0-9._~!$&'()*+,;=:@-]""".r
  private def escape = """%[A-Fa-f0-9]{2}""".r
  private def any = """.""".r

  private def encodeSegment(input: String): String = parseAll(segment, input).get.mkString
  private def encodeSearch(input: String): String = encodeSegment(input)

  def encode(url: String): String = {
    val u = new URL(url)
    // Limit -1 keeps trailing empty segments, so a trailing "/" is preserved.
    val path = u.getPath.split("/", -1).map(encodeSegment).mkString("/")
    val query = u.getQuery match {
      case null => ""
      case q: String => "?" + encodeSearch(q)
    }
    val hash = u.getRef match {
      case null => ""
      case h: String => "#" + encodeSegment(h)
    }
    s"${u.getProtocol}://${u.getAuthority}$path$query$hash"
  }
}

import org.scalatest.{ FunSuite, Matchers }

class IdempotentURLEncoderSpec extends FunSuite with Matchers {
  import IdempotentURLEncoder._

  test("Idempotent operation") {
    val url = "http://ja.wikipedia.org/wiki/文字"
    assert(encode(url) == encode(encode(url)))
    assert(encode(url) == encode(encode(encode(url))))
  }
  test("Segment encoding") {
    encode("http://ja.wikipedia.org/wiki/文字")
      .shouldBe("http://ja.wikipedia.org/wiki/%E6%96%87%E5%AD%97")
  }
  test("Query string encoding") {
    encode("http://qiita.com/search?utf8=✓&sort=rel&q=開発&sort=rel")
      .shouldBe("http://qiita.com/search?utf8=%E2%9C%93&sort=rel&q=%E9%96%8B%E7%99%BA&sort=rel")
  }
  test("Hash encoding") {
    // encode() runs the fragment through encodeSegment, so it is escaped as well.
    encode("https://www.google.co.jp/#q=文字")
      .shouldBe("https://www.google.co.jp/#q=%E6%96%87%E5%AD%97")
  }
  test("Partial encoding") {
    encode("http://en.wiktionary.org/wiki/français")
      .shouldBe("http://en.wiktionary.org/wiki/fran%C3%A7ais")
  }
  test("Space is encoded as +") {
    encode("http://example.com/foo bar buz")
      .shouldBe("http://example.com/foo+bar+buz")
  }
  test("Multibyte domain names are not supported yet :(") {
    encode("http://日本語.jp")
      .shouldBe("http://日本語.jp")
  }
}
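This code comes from Qiita.
Answer 4 (score: -1)
The standard Java API can do URL encoding and decoding by itself.
Try the classes URLDecoder and URLEncoder.
To encode text so that it passes safely over the Internet: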
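import java.io.UnsupportedEncodingException; // not part of java.net
import java.net.*;
...
try {
    encodedValue = URLEncoder.encode(rawValue, "UTF-8");
} catch (UnsupportedEncodingException uee) {
    // Cannot happen: every Java platform supports UTF-8.
}
To decode:
try {
    decodedValue = URLDecoder.decode(encodedValue, "UTF-8");
} catch (UnsupportedEncodingException uee) {
    // Cannot happen: every Java platform supports UTF-8.
}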
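To see how the two classes relate, here is a small round-trip sketch; the class FormCodec and its two wrapper methods are hypothetical names used only for this illustration.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

class FormCodec {
    static String encode(String raw) {
        try {
            return URLEncoder.encode(raw, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    static String decode(String encoded) {
        try {
            return URLDecoder.decode(encoded, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String raw = "a b&c=d";
        String encoded = encode(raw);           // a+b%26c%3Dd
        System.out.println(encoded);
        System.out.println(decode(encoded));    // a b&c=d
        System.out.println(decode("a+b"));      // "a b": a literal '+' is read as a space
    }
}
```

Decoding first and then encoding is sometimes suggested as a way to avoid double encoding, but as the last line shows, the decoder rewrites a literal + in user input, so that approach is lossy for URLs that are not form-encoded.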
Answer 5 (score: -1)
java.lang.NoSuchMethodError: No static method asInterface(Landroid/os/IBinder;)Lcom/miui/guardprovider/aidl/IAntiVirusServer; in class Lcom/miui/guardprovider/aidl/IAntiVirusServer$Stub; or its super classes (declaration of 'com.miui.guardprovider.aidl.IAntiVirusServer$Stub' appears in base.apk)
at com.miui.securitycenter.dynamic.app.UpdateVirusUtils$GpServiceConn.onServiceConnected(UpdateVirusUtils.java:48)
at android.app.LoadedApk$ServiceDispatcher.doConnected(LoadedApk.java:1738)
at android.app.LoadedApk$ServiceDispatcher$RunConnection.run(LoadedApk.java:1770)
at android.os.Handler.handleCallback(Handler.java:873)
at android.os.Handler.dispatchMessage(Handler.java:99)
at android.os.Looper.loop(Looper.java:201)
at android.app.ActivityThread.main(ActivityThread.java:6810)
at java.lang.reflect.Method.invoke(Native Method)
at com.android.internal.os.RuntimeInit$MethodAndArgsCaller.run(RuntimeInit.java:547)
at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:873)