Question

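Good afternoon,

I have a question about HttpURLConnection and internet restrictions...

What I am trying to do:

I am trying to write a program that connects to the website http://www.epexspot.com and reads the price history of the peak and base electricity products.

Why I am trying to do this:

So far the prices have been collected by hand, which is a tedious process. I therefore want to automate it with a small program.

What I have done so far:

I wrote a Java (JDK7u21) program that uses HttpURLConnection to contact the home page and fetch the response it sends. This is pretty much what I wrote: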

HttpConnector.java

package network;

import java.io.BufferedReader;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.ArrayList;

public class HttpConnector {
String urlParameters, method;
URL url;
HttpURLConnection conn;
BufferedReader in;

public HttpConnector(String host, String method) throws IOException{
    if(!host.startsWith("http://") && !host.startsWith("https://"))
        host = "http://" + host;

    this.method = method;
    urlParameters = "";
    url = new URL(host);
}

public HttpConnector(String host, String method, String parameters) throws IOException{
    if(!host.startsWith("http://") && !host.startsWith("https://"))
        host = "http://" + host;

    this.method = method;
    urlParameters = parameters;
    url = new URL(host);
}

public void openConnection() throws IOException{
    conn = (HttpURLConnection) url.openConnection();
    conn.setRequestMethod(method);
    conn.setRequestProperty("User-Agent", "Mozilla/5.0 (Windows NT 6.1; rv:21.0) Gecko/20100101 Firefox/21.0");
    conn.setRequestProperty("Host", url.getHost());
    conn.setRequestProperty("Connection", "keep-alive");
    // Compare string contents, not references
    if(urlParameters != null && !urlParameters.isEmpty())
        conn.setRequestProperty("Content-Length", Integer.toString(urlParameters.getBytes().length));
    conn.setRequestProperty("Accept-Language", "de-de,de;q=0.8,en-us;q=0.5,en;q=0.3");
    conn.setRequestProperty("Accept-Encoding", "deflate");
    conn.setRequestProperty("Accept", "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8");
    conn.setUseCaches(false);
    conn.setDoInput(true);
    conn.setDoOutput(true);
}

public void sendRequest() throws IOException{
    if("POST".equals(method)){
        DataOutputStream out = new DataOutputStream(conn.getOutputStream());
        out.writeBytes(urlParameters);
        out.flush();
        out.close();
    }
}

public ArrayList<String> read() throws IOException{
    int status = conn.getResponseCode();
    // For non-success codes prefer the error body; getErrorStream() returns null if there is none
    java.io.InputStream stream = (status > 226 || status < 200) ? conn.getErrorStream() : conn.getInputStream();
    if(stream == null) stream = conn.getInputStream();
    in = new BufferedReader(new InputStreamReader(stream));
    ArrayList<String> resp = new ArrayList<String>();
    String respTmp;

    while((respTmp=in.readLine())!=null){
        resp.add(respTmp);
    }
    return resp;
}

public void close(){
    if(conn!=null) conn.disconnect();
}

public ArrayList<String> communicate() throws IOException{
    ArrayList<String> resp = new ArrayList<String>();
    try{
        openConnection();
        sendRequest();
        resp=read();
    }catch(Exception e){
        e.printStackTrace(System.err);
    }finally{
        close();
    }
    return resp;
}

}

Main.java

import java.io.BufferedWriter;
import java.io.File;
import java.io.FileWriter;
import java.util.ArrayList;

import network.HttpConnector;


public class Main {
    public static void main(String[] args) {
        try{
            File f = new File("response.html");
            if(!f.exists()) f.createNewFile();

//          String host = "http://www.epexspot.com/en/market-data/auction/auction-table/2013-05-28/DE";
// this is where I actually need to go; google.at is merely for testing purposes
            String host = "www.google.at";
            String method = "GET";

            ArrayList<String> response = new ArrayList<String>();
            HttpConnector conn = new HttpConnector(host,method);
            response = conn.communicate();

            FileWriter fw = new FileWriter(f);
            BufferedWriter out = new BufferedWriter(fw);

            for(String resp : response){
                System.out.println(resp);
                out.write(resp+"\n");
            }

            out.flush();
            out.close();
            fw.close();
        }catch(Exception e){
            e.printStackTrace();
        }
    }

}

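Short explanation: HttpConnector connects to a given host using a given method (mainly POST or GET) and given URL parameters (which I do not use). It sets a few request properties (for example User-Agent, ...) and then tries to read the response (via the InputStream, or via the ErrorStream if the response status indicates a failure).

Main calls HttpConnector with a specific URL (for example www.epexspot.com/en/) and a specific method (POST or GET). It then reads the connection's response and prints it to the console as well as to a file (response.html).

Where the problem is:

At work, traffic is regulated, which means some sites are blocked (much like they are at school). So of course, if I give my little program the URL of a social media platform, it spits out something like "Error 403 - The content of this page has been blocked. If you need this page for work, please contact your administrator".

The same thing happens when I try to access the page I actually need, epexspot.com. BUT: when I open it in regular Mozilla Firefox (v21), the page is NOT blocked. For some pages my program works fine, but for most it does not (for example, www.google.at and www.ivb.at work fine, while most other pages do not).

I have already tried to make my program look like Firefox as far as request properties go, but so far without success... Am I missing some request property or setting that makes the internet regulation software block my program but not Mozilla Firefox?

So, my main question is:

What could cause my program to be blocked almost everywhere, while Firefox is not blocked at all?

I will try to contact the network administrators and hope they have a solution so that my program is no longer blocked, but I would still like to know what makes such a significant difference between Firefox and my program.

Thanks in advance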

Answer 1

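OK, the answer is simple...

Firefox was configured to use an automatically configured proxy server, so this is what I did:

I opened my website in Firefox, used the netstat -an | find "EST" trick to find out the proxy's address (and port), and made my program use it with these lines:

System.setProperty("http.proxyHost", proxyAddress); and System.setProperty("http.proxyPort", "8080");

That solved my problem...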
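For reference, here is a minimal sketch of two ways to apply such a proxy from Java: JVM-wide through system properties (what I did above), or per connection through a java.net.Proxy object. The class name ProxyConfig, its method names, and the host proxy.example.com are placeholders made up for illustration; substitute the address and port that netstat reports on your network.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.net.URL;

// Hypothetical helper class; names and values are illustrative only.
final class ProxyConfig {

    /** Routes HTTP and HTTPS traffic of the whole JVM through the given proxy. */
    static void useJvmWideProxy(String proxyHost, int proxyPort) {
        String port = Integer.toString(proxyPort);
        System.setProperty("http.proxyHost", proxyHost);
        System.setProperty("http.proxyPort", port);
        // https:// URLs read their own pair of properties.
        System.setProperty("https.proxyHost", proxyHost);
        System.setProperty("https.proxyPort", port);
    }

    /** Builds an HTTP proxy description without doing a DNS lookup yet. */
    static Proxy httpProxy(String proxyHost, int proxyPort) {
        return new Proxy(Proxy.Type.HTTP,
                InetSocketAddress.createUnresolved(proxyHost, proxyPort));
    }

    /** Opens a single connection through the proxy, leaving global settings alone. */
    static HttpURLConnection openViaProxy(URL url, Proxy proxy) throws IOException {
        return (HttpURLConnection) url.openConnection(proxy);
    }
}
```

In HttpConnector, the per-connection variant would mean replacing url.openConnection() with url.openConnection(ProxyConfig.httpProxy("proxy.example.com", 8080)). That keeps the proxy setting local to the class instead of changing it for every connection in the JVM.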

Thanks to jtahlborn for the hint!

Edit

Using a ProxySelector also works well; if you need it, see this link: http://docs.oracle.com/javase/6/docs/technotes/guides/net/proxies.html
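A minimal sketch of the ProxySelector approach, assuming a single fixed HTTP proxy. The class name FixedProxySelector and the host proxy.example.com are hypothetical; the guide linked above shows a fuller version with failover handling.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.net.ProxySelector;
import java.net.SocketAddress;
import java.net.URI;
import java.util.Collections;
import java.util.List;

// Hypothetical selector: sends http/https requests through one fixed proxy
// and lets every other protocol connect directly.
final class FixedProxySelector extends ProxySelector {
    private final Proxy proxy;

    FixedProxySelector(String proxyHost, int proxyPort) {
        this.proxy = new Proxy(Proxy.Type.HTTP,
                InetSocketAddress.createUnresolved(proxyHost, proxyPort));
    }

    @Override
    public List<Proxy> select(URI uri) {
        if (uri == null) {
            throw new IllegalArgumentException("uri must not be null");
        }
        String scheme = uri.getScheme();
        if ("http".equalsIgnoreCase(scheme) || "https".equalsIgnoreCase(scheme)) {
            return Collections.singletonList(proxy);
        }
        return Collections.singletonList(Proxy.NO_PROXY);
    }

    @Override
    public void connectFailed(URI uri, SocketAddress sa, IOException ioe) {
        // This sketch has no fallback proxy, so there is nothing to update here.
    }
}
```

It would be installed once at startup, before any connection is opened, with ProxySelector.setDefault(new FixedProxySelector("proxy.example.com", 8080)). After that, url.openConnection() consults the selector for every request.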

Java HttpURLConnection is blocked at work, but Firefox is not
