HTTP x-www-form-urlencoded charset

Date: 2019-03-11 22:03:42

Tags: java http

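I am currently developing a simple HTTP client library, mainly for learning purposes and small private projects. I am stuck on choosing the right charset when encoding strings and then converting them to bytes.

Here is what I am doing at the moment, and I would like to know whether it is correct.

Using URLEncoder:

Whenever I have to encode part of a request with URLEncoder, I pass the string's charset to the URLEncoder.encode method and always convert the result to bytes using UTF-8.

For example: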

package io.medev.httpclient.request.body;

import java.io.IOException;
import java.io.OutputStream;
import java.net.URLEncoder;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class KeyValueFormDataParameter implements FormDataParameter {

    private final String key;
    private final String value;
    private final Charset charset;

    public KeyValueFormDataParameter(String key, String value, Charset charset) {
        this.key = key;
        this.value = value;
        this.charset = charset;
    }

    @Override
    public String getName() {
        return this.key;
    }

    @Override
    public boolean isBinaryTransferEncoding() {
        return false;
    }

    @Override
    public String getContentType() {
        return "application/x-www-form-urlencoded; charset=" + this.charset.name();
    }

    @Override
    public void write(OutputStream out) throws IOException {
        byte[] bytes = URLEncoder.encode(this.value, this.charset.name()).getBytes(StandardCharsets.UTF_8);
        out.write(bytes);
    }
}
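To make the role of the charset argument concrete, here is a minimal sketch of how the charset passed to URLEncoder.encode changes the percent-escapes that end up in the body. The class name UrlEncoderCharsetDemo and the helper encode are hypothetical and only used for illustration; they are not part of the library above.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the library shown in the question.
final class UrlEncoderCharsetDemo {

    // Mirrors the call pattern used in KeyValueFormDataParameter.write:
    // the Charset only decides which bytes are percent-escaped.
    static String encode(String value, Charset charset) {
        try {
            return URLEncoder.encode(value, charset.name());
        } catch (UnsupportedEncodingException e) {
            // Should not happen for a Charset the JVM has already resolved.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // "ä b", written with a Unicode escape to avoid source-encoding issues.
        String value = "\u00e4 b";
        System.out.println(encode(value, StandardCharsets.UTF_8));      // %C3%A4+b
        System.out.println(encode(value, StandardCharsets.ISO_8859_1)); // %E4+b
    }
}
```

The same input yields different escapes depending on the charset, which is why the receiver needs to know which charset was used for the escaping step.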

Everything I would call "metadata" (for example the string "Content-Disposition" and so on) is written to the OutputStream using UTF-8.

For example:

package io.medev.httpclient.request.body;

import java.io.IOException;
import java.io.OutputStream;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.Collection;
import java.util.Objects;
import java.util.UUID;

public class FormDataRequestBody implements RequestBody {

    private final Collection<? extends FormDataParameter> parameters;
    private final String boundary;

    public FormDataRequestBody(Collection<? extends FormDataParameter> parameters) {
        this.parameters = Objects.requireNonNull(parameters);
        this.boundary = UUID.randomUUID().toString().replace("-", "");
    }

    @Override
    public String getContentType() {
        return "multipart/form-data; boundary=" + this.boundary;
    }

    @Override
    public void write(OutputStream out) throws IOException {
        byte[] boundaryBytes = ("--" + this.boundary + "\r\n").getBytes(StandardCharsets.UTF_8);

        for (FormDataParameter parameter : this.parameters) {
            out.write(boundaryBytes);
            writeString(out, "Content-Disposition: ");

            String nameEncoded = URLEncoder.encode(parameter.getName(), "UTF-8");

            if (parameter.isBinaryTransferEncoding()) {
                writeString(out, "form-data; name=\"" + nameEncoded + "\"; filename=\"" + nameEncoded + "\"\r\n");
            } else {
                writeString(out, "form-data; name=\"" + nameEncoded + "\"\r\n");
            }

            writeString(out, "Content-Type: " + parameter.getContentType() + "\r\n");

            if (parameter.isBinaryTransferEncoding()) {
                writeString(out, "Content-Transfer-Encoding: binary\r\n");
            }

            writeString(out, "\r\n");
            parameter.write(out);
            writeString(out, "\r\n");
        }

        writeString(out, "--" + this.boundary + "--\r\n");
    }

    private void writeString(OutputStream out, String str) throws IOException {
        out.write(str.getBytes(StandardCharsets.UTF_8));
    }
}

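My reasoning is that every string returned by URLEncoder.encode should be a valid UTF-8 string, so converting it to bytes with UTF-8 should be safe. Or am I wrong here?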
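One way to check this assumption is the small sketch below. As far as I can tell from the URLEncoder documentation, the returned string contains only ASCII letters, digits, ".", "-", "*", "_", "+" and "%xy" escapes, so the bytes should be identical whether the result is converted with UTF-8 or US-ASCII. The class name EncodedOutputIsAsciiDemo and its helpers are hypothetical and exist only for this check.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class, not part of the library shown in the question.
final class EncodedOutputIsAsciiDemo {

    // Same call pattern as in the question: escape using the given charset.
    static String encode(String value, Charset charset) {
        try {
            return URLEncoder.encode(value, charset.name());
        } catch (UnsupportedEncodingException e) {
            // Should not happen for a Charset the JVM has already resolved.
            throw new IllegalStateException(e);
        }
    }

    // True if every char in the string is in the 7-bit ASCII range.
    static boolean isAscii(String s) {
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) > 0x7F) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // "café & co", escaped with ISO-8859-1 rather than UTF-8.
        String encoded = encode("caf\u00e9 & co", StandardCharsets.ISO_8859_1);

        byte[] asUtf8 = encoded.getBytes(StandardCharsets.UTF_8);
        byte[] asAscii = encoded.getBytes(StandardCharsets.US_ASCII);

        System.out.println(isAscii(encoded));               // true
        System.out.println(Arrays.equals(asUtf8, asAscii)); // true
    }
}
```

If that holds, the charset used in getBytes after URL-encoding does not affect the bytes on the wire; only the charset handed to URLEncoder.encode does.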
