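I'm currently developing a simple HTTP client library, mainly for learning purposes and small private projects. I'm stuck on choosing the correct charset for encoding and then converting strings to bytes.
This is what I have so far, and I'd like to know whether it is correct.
Whenever I have to encode parts of the request with URLEncoder, I pass the string's charset to URLEncoder.encode and always convert the result to bytes using UTF-8.
For example: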
    package io.medev.httpclient.request.body;

    import java.io.IOException;
    import java.io.OutputStream;
    import java.net.URLEncoder;
    import java.nio.charset.Charset;
    import java.nio.charset.StandardCharsets;

    public class KeyValueFormDataParameter implements FormDataParameter {

        private final String key;
        private final String value;
        private final Charset charset;

        public KeyValueFormDataParameter(String key, String value, Charset charset) {
            this.key = key;
            this.value = value;
            this.charset = charset;
        }

        @Override
        public String getName() {
            return this.key;
        }

        @Override
        public boolean isBinaryTransferEncoding() {
            return false;
        }

        @Override
        public String getContentType() {
            return "application/x-www-form-urlencoded; charset=" + this.charset.name();
        }

        @Override
        public void write(OutputStream out) throws IOException {
            byte[] bytes = URLEncoder.encode(this.value, this.charset.name()).getBytes(StandardCharsets.UTF_8);
            out.write(bytes);
        }
    }
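To make the role of the charset argument concrete, here is a minimal sketch of how the charset passed to URLEncoder.encode changes the percent-escapes that are produced. The class name EncodeCharsetDemo and the helper encode are made up for illustration and are not part of the library above.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodeCharsetDemo {

    // Hypothetical helper: wraps URLEncoder.encode(String, String) so callers
    // do not have to deal with the checked UnsupportedEncodingException.
    static String encode(String value, Charset charset) {
        try {
            return URLEncoder.encode(value, charset.name());
        } catch (UnsupportedEncodingException e) {
            // Not expected for a Charset instance the JVM already handed out.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String value = "\u00e4 b"; // "ä b", written as an escape to avoid source-encoding issues

        // The same input produces different escapes depending on the charset.
        System.out.println(encode(value, StandardCharsets.UTF_8));      // %C3%A4+b
        System.out.println(encode(value, StandardCharsets.ISO_8859_1)); // %E4+b
    }
}
```

The charset therefore only decides which bytes end up behind the % signs; the returned string itself consists of plain ASCII characters in both cases.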
Everything I call "metadata" (for example, the string "Content-Disposition" and so on) is written to the OutputStream using UTF-8.
For example:
    package io.medev.httpclient.request.body;

    import java.io.IOException;
    import java.io.OutputStream;
    import java.net.URLEncoder;
    import java.nio.charset.StandardCharsets;
    import java.util.Collection;
    import java.util.Objects;
    import java.util.UUID;

    public class FormDataRequestBody implements RequestBody {

        private final Collection<? extends FormDataParameter> parameters;
        private final String boundary;

        public FormDataRequestBody(Collection<? extends FormDataParameter> parameters) {
            this.parameters = Objects.requireNonNull(parameters);
            this.boundary = UUID.randomUUID().toString().replace("-", "");
        }

        @Override
        public String getContentType() {
            return "multipart/form-data; boundary=" + this.boundary;
        }

        @Override
        public void write(OutputStream out) throws IOException {
            byte[] boundaryBytes = ("--" + this.boundary + "\r\n").getBytes(StandardCharsets.UTF_8);

            for (FormDataParameter parameter : this.parameters) {
                out.write(boundaryBytes);
                writeString(out, "Content-Disposition: ");

                String nameEncoded = URLEncoder.encode(parameter.getName(), "UTF-8");
                if (parameter.isBinaryTransferEncoding()) {
                    writeString(out, "form-data; name=\"" + nameEncoded + "\"; filename=\"" + nameEncoded + "\"\r\n");
                } else {
                    writeString(out, "form-data; name=\"" + nameEncoded + "\"\r\n");
                }

                writeString(out, "Content-Type: " + parameter.getContentType() + "\r\n");
                if (parameter.isBinaryTransferEncoding()) {
                    writeString(out, "Content-Transfer-Encoding: binary\r\n");
                }

                writeString(out, "\r\n");
                parameter.write(out);
                writeString(out, "\r\n");
            }

            writeString(out, "--" + this.boundary + "--\r\n");
        }

        private void writeString(OutputStream out, String str) throws IOException {
            out.write(str.getBytes(StandardCharsets.UTF_8));
        }
    }
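My reasoning is that every string returned by URLEncoder.encode should be a valid UTF-8 string, so converting it to bytes with UTF-8 should be safe. Or am I wrong here?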
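One way to test this assumption is a small check like the sketch below. It relies on the documented behavior that URLEncoder.encode leaves only letters, digits, ".", "-", "*" and "_" unescaped, turns spaces into "+", and writes everything else as %XX escapes, so the result should contain ASCII characters only. The class name EncodedOutputIsAscii and its helper methods are made up for illustration.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class EncodedOutputIsAscii {

    // Hypothetical helper: URL-encodes with UTF-8 and hides the checked exception.
    static String encodeUtf8(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required to be supported on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    // True if every char of s is in the 7-bit ASCII range.
    static boolean isAscii(String s) {
        return s.chars().allMatch(c -> c < 0x80);
    }

    // True if s converts to identical bytes in UTF-8, US-ASCII and ISO-8859-1.
    static boolean sameBytesInAsciiCompatibleCharsets(String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        return Arrays.equals(utf8, s.getBytes(StandardCharsets.US_ASCII))
                && Arrays.equals(utf8, s.getBytes(StandardCharsets.ISO_8859_1));
    }

    public static void main(String[] args) {
        // Sample input with accented Latin letters, CJK characters and reserved characters.
        String encoded = encodeUtf8("na\u00efve caf\u00e9 \u4e2d\u6587 & more");

        System.out.println(encoded);
        System.out.println(isAscii(encoded));                            // true
        System.out.println(sameBytesInAsciiCompatibleCharsets(encoded)); // true
    }
}
```

If the check holds for the inputs that matter, the charset used in getBytes on the encoded string makes no difference as long as it is ASCII-compatible. This only covers the output of URLEncoder.encode; other strings written through writeString are not covered by it.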