Multipart POST - uploading a file

Date: 2014-10-31 01:41:36

Tags: java parsing file-upload web-scraping dropzone.js

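I am trying to upload a file to a server programmatically rather than through the website. I can access the site successfully and can scrape/parse its different pages, but I cannot upload a file. The server in question is running PaperCut with Apache Tapestry.

Here is the relevant Java code: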

private String uploadFile (String params, String filePath, String HTML) throws Exception {

    String postUrl = getUploadUrl(HTML);
    File fileToUpload = new File(filePath);
    postUrl = "http://printing.**.ca:9191" + postUrl;
    String random = "";

    Random ran = new Random();
    for (int i = 0; i < 28; i++) {
        random = random + String.valueOf(ran.nextInt(9));
    }
    String boundry = "---------------------------" + random; 

    URL obj = new URL(postUrl);
    connection = (HttpURLConnection)obj.openConnection();

    connection.setUseCaches(false);
    connection.setRequestMethod("POST");
    connection.setRequestProperty("Content-Type", "multipart/form-data; boundary=" + boundry);
    connection.setRequestProperty("Host", "printing.**.ca:9191");
    connection.setRequestProperty("User-Agent", USER_AGENT);
    connection
    .setRequestProperty("Accept",
            "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8");
    connection.setRequestProperty("Accept-Language", "en-US,en;q=0.5");
    connection.setRequestProperty("Referer",
            "http://printing.**.ca:9191/app");
    connection.setRequestProperty("Connection", "keep-alive");

    for (String cookie : this.cookies) {
        connection.addRequestProperty("Cookie", cookie.split(";", 1)[0]);
    }
    connection.setDoInput(true);
    connection.setDoOutput(true);

    String fileType = getFileType(fileToUpload);

    PrintWriter writer = null;
    try {
        writer = new PrintWriter(new OutputStreamWriter(connection.getOutputStream()));
        writer.println(boundry);
        writer.println("Content-Disposition: " + "form-data; " + "name=\"file[]\"; " + "filename=\"" + fileToUpload.getName() + "\"");
        writer.println("Content-Type: " + fileType);
        writer.println();
        BufferedReader reader = null;

        try {
            reader = new BufferedReader(new InputStreamReader(new FileInputStream(fileToUpload)));


            for (String line; (line = reader.readLine()) != null;) {
                writer.print(line);
            }
        } finally {
            if (reader != null) {
                try {
                    reader.close();
                } catch (IOException e) {}
            }

        }
        writer.println(boundry + "--");
    } finally {
        if (writer != null) {
            writer.close();
        }
    }
    int responseCode = connection.getResponseCode();

    BufferedReader reader = new BufferedReader(new InputStreamReader(connection.getInputStream()));
    StringBuffer buffer = new StringBuffer();
    String inputLine;
    while((inputLine = reader.readLine()) != null) {
        buffer.append(inputLine);
    }
    reader.close();
    return null;
}

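I used Wireshark to compare a successful request with my failing one, and I cannot work out where I am going wrong.

Here is the successful request: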

POST /upload/3229 HTTP/1.1
Host: printing.**.ca:9191
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:33.0) Gecko/20100101 Firefox/33.0
Accept: application/json
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Cache-Control: no-cache
X-Requested-With: XMLHttpRequest
Referer: http://printing.**.ca:9191/app
Content-Length: 27682
Content-Type: multipart/form-data; boundary=---------------------------8555452061745260577115383266
Cookie: JSESSIONID=1w7usft10tnew
Connection: keep-alive
Pragma: no-cache

-----------------------------8555452061745260577115383266
Content-Disposition: form-data; name="file[]"; filename="hello.xls"
Content-Type: application/vnd.ms-excel
**data***
-----------------------------8555452061745260577115383266--

This returns 200 OK and then fires:

POST /app HTTP/1.1
Host: printing.**.ca:9191
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:33.0) Gecko/20100101 Firefox/33.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Referer: http://printing.**.ca:9191/app
Cookie: org.apache.tapestry.locale=en; JSESSIONID=1w7usft10tnew
Connection: keep-alive
Content-Type: application/x-www-form-urlencoded
Content-Length: 66

service=direct%2F1%2FUserWebPrintUpload%2F%24Form%240&sp=S1&Form1=

Here are the packets from my request:

POST /upload/3239 HTTP/1.1
Content-Type: multipart/form-data; boundary=---------------------------6735033783816657573427817664
User-Agent: Mozilla/5.0
Accept: application/json
Accept-Language: en-US,en;q=0.5
Referer: http://printing.**.ca:9191/app
X-Requested-With: XMLHttpRequest
Cache-Control: no-cache
Pragma: no-cache
Host: printing.**.ca:9191
Connection: keep-alive
Content-Length: 46431
Cookie: JSESSIONID=1i2ym6tnouzkw;

---------------------------6735033783816657573427817664
Content-Disposition: form-data; name="file[]"; filename="hello.xls"
Content-Type: application/vnd.ms-excel
**data*
---------------------------6735033783816657573427817664--

I then get 200 OK and fire this:

POST /app HTTP/1.1
User-Agent: Mozilla/5.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Referer: http://printing.**.ca:9191/app
Content-Type: application/x-www-form-urlencoded
Cache-Control: no-cache
Pragma: no-cache
Host: printing.**.ca:9191
Connection: keep-alive
Content-Length: 66
Cookie: JSESSIONID=1i2ym6tnouzkw;

service=direct%2F1%2FUserWebPrintUpload%2F%24Form%240&sp=S1&Form1=

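and the response contains an error about the uploaded file. The /upload/3239 path is scraped from the HTML of the form that contains the file upload. The site also uses Dropzone.js, but it can fall back to a simple upload form. As for the session cookie, the trailing ";" character is sent on all my other requests without any failure. I can access the site; it just does not seem to upload the file correctly.

Thoughts?

1 answer:

Answer 0 (score: 1)

I spent a good part of the last few hours figuring out how to do this with Apache HttpClient. Here is the code for anyone who needs it in the future.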

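import java.io.File;
import java.util.Random;

import org.apache.http.HttpResponse;
import org.apache.http.client.HttpClient;
import org.apache.http.client.methods.HttpPost;
import org.apache.http.entity.ContentType;
import org.apache.http.entity.mime.HttpMultipartMode;
import org.apache.http.entity.mime.MultipartEntityBuilder;
import org.apache.http.entity.mime.content.FileBody;
import org.apache.http.impl.client.DefaultHttpClient;

    private String uploadFile(String filePath, String HTML) throws Exception {

        // The upload path (e.g. /upload/3239) is scraped from the form HTML.
        String postUrl = getUploadUrl(HTML);
        postUrl = "http://printing.**.ca:9191" + postUrl;
        HttpPost post = new HttpPost(postUrl);

        HttpClient client = new DefaultHttpClient();

        // Let HttpClient write the multipart framing and the raw file bytes.
        MultipartEntityBuilder builder = MultipartEntityBuilder.create();
        builder.setMode(HttpMultipartMode.BROWSER_COMPATIBLE);

        // Browser-style boundary: 27 dashes followed by 28 random digits.
        String random = "";
        Random ran = new Random();
        for (int i = 0; i < 28; i++) {
            random = random + String.valueOf(ran.nextInt(9));
        }
        String boundary = "---------------------------" + random;

        final File file = new File(filePath);

        // The content type and file name are hard-coded for the test file.
        FileBody fb = new FileBody(file, ContentType.create("application/vnd.ms-excel"), "hello.xls");

        builder.addPart("file[];", fb);
        builder.setBoundary(boundary);

        post.setEntity(builder.build());

        post.setHeader("Host", "printing.**.ca:9191");
        post.setHeader("User-Agent", USER_AGENT);
        post.setHeader("Accept",
                "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8");
        post.setHeader("Accept-Language", "en-US,en;q=0.5");
        post.setHeader("Referer",
                "http://printing.**.ca:9191/app");
        post.setHeader("Connection", "keep-alive");

        // Send only the JSESSIONID pair from the stored Set-Cookie values.
        for (String cookie : this.cookies) {
            for (String c : cookie.split(";")) {
                if (c.contains("JSESSION")) {
                    post.setHeader("Cookie", c);
                }
            }
        }

        // Upload the file, then submit the follow-up form POST to /app.
        HttpResponse response = client.execute(post);
        String reply = sendPost("http://printing.**.ca:9191/app", getUploadParameters(HTML));

        return response.toString();
    }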
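
For anyone who would rather stay with HttpURLConnection: in the captures from the question, the delimiter lines in the browser's body carry two more hyphens than the boundary value in its Content-Type header, while the hand-built request repeats the header value unchanged. That matches the multipart syntax in RFC 2046, where each delimiter line is "--" followed by the boundary, the closing one adds a trailing "--", and lines end in CRLF. The file content also has to be written as raw bytes, since a Reader/Writer pair re-encodes binary data. The following is a minimal sketch of that framing under those assumptions, not the code used above; the class name MultipartBodySketch and the method buildBody are made up for illustration, and it has not been tested against the PaperCut server.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, for illustration only.
class MultipartBodySketch {

    private static final String CRLF = "\r\n";

    /** Builds a single-file multipart/form-data body as raw bytes. */
    public static byte[] buildBody(String boundary, String fieldName, String fileName,
                                   String contentType, byte[] fileBytes) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();

        // Each delimiter line is "--" + boundary, terminated by CRLF.
        String head = "--" + boundary + CRLF
                + "Content-Disposition: form-data; name=\"" + fieldName
                + "\"; filename=\"" + fileName + "\"" + CRLF
                + "Content-Type: " + contentType + CRLF
                + CRLF;
        out.write(head.getBytes(StandardCharsets.ISO_8859_1));

        // The file content is copied byte for byte, never through a Reader.
        out.write(fileBytes);

        // The closing delimiter is "--" + boundary + "--".
        String tail = CRLF + "--" + boundary + "--" + CRLF;
        out.write(tail.getBytes(StandardCharsets.ISO_8859_1));

        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // Sample bytes standing in for a binary file.
        byte[] payload = {0x00, (byte) 0xFF, 0x0D, 0x0A, 0x7F};
        byte[] body = buildBody("XYZ", "file[]", "hello.xls",
                "application/vnd.ms-excel", payload);
        System.out.println(body.length + " bytes");
    }
}
```

With HttpURLConnection, the returned array would be written directly to connection.getOutputStream(), and the Content-Type header would carry the same boundary value without the leading "--".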