Faster way to fetch 353 web pages in Java with BufferedReader

Time: 2017-06-12 16:54:59

Tags: java urlconnection

I checked the other questions on this, but I could not find a solution. I can connect to 353 links and scrape the data I need in about 7 minutes. I need to cut that down to under 1 minute.

I have included my code below.

URL urlChartLink;
URLConnection urlconn;

try {
    Class.forName(driver).newInstance();
    Connection mysqlconn = DriverManager.getConnection(url + dbName, userName, password);
    Statement st1 = mysqlconn.createStatement();
    Connection mysqlconn2 = DriverManager.getConnection(url + dbName, userName, password);
    ResultSet rs1 = st1.executeQuery(strSQL);

    // Pages are fetched one at a time, so the total time is the sum of all 353 requests
    while (rs1.next())
    {
        sElementID = rs1.getString(1);
        sSymbol = rs1.getString(2);
        sChartLink = rs1.getString(3);

        urlChartLink = new URL(sChartLink);
        urlconn = urlChartLink.openConnection();
        urlconn.addRequestProperty("User-Agent", "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)");

        sCurrentPrice = "";
        sPriceChange = "";

        try {
            BufferedReader in = new BufferedReader(new InputStreamReader(urlconn.getInputStream(), "UTF-8"));
            String currentLine;

            int iLine = 0;

            while ((currentLine = in.readLine()) != null) {
                //Get data from page
                iLine += 1;

            }
            in.close();
        } catch (IOException e) {

        }
    }

    st1.close();

    mysqlconn.close();
    mysqlconn2.close();

} catch (Exception e) {
    e.printStackTrace();
}

I have tried it without the URLConnection, but I get a 403 error.

If anyone can give me a better solution to this, that would be great!!

Eddi Rae

1 answer:

Answer 0 (score: 1)

Use an ExecutorService with a thread pool so the pages are fetched in parallel instead of one at a time. This code is not 100% syntax-error free, but it should give you the right idea.

Class.forName(driver).newInstance();
Connection mysqlconn = DriverManager.getConnection(url + dbName, userName, password);
Statement st1 = mysqlconn.createStatement();
Connection mysqlconn2 = DriverManager.getConnection(url + dbName, userName, password);
ResultSet rs1 = st1.executeQuery(strSQL);
// Fixed pool: up to 30 pages are fetched at the same time
ExecutorService service = Executors.newFixedThreadPool(30);

while (rs1.next())
{
    // Read the row on the main thread. A ResultSet should not be shared across
    // threads, and the cursor will have moved on by the time the task runs.
    final String sElementID = rs1.getString(1);
    final String sSymbol = rs1.getString(2);
    final String sChartLink = rs1.getString(3);

    service.execute(new Runnable() {
        public void run() {
            // Locals rather than shared fields, so tasks do not overwrite each other
            String sCurrentPrice = "";
            String sPriceChange = "";

            try {
                URL urlChartLink = new URL(sChartLink);
                URLConnection urlconn = urlChartLink.openConnection();
                urlconn.addRequestProperty("User-Agent", "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)");

                BufferedReader in = new BufferedReader(new InputStreamReader(urlconn.getInputStream(), "UTF-8"));
                String currentLine;

                while ((currentLine = in.readLine()) != null) {
                    //Call synchronized method to perform operations that need to be thread safe
                    addLine();

                }
                in.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    });
}

// Stop accepting new tasks, then wait until every submitted task has finished.
// isShutdown() turns true immediately after shutdown(), so check isTerminated().
service.shutdown();
while (!service.isTerminated()) {
    Thread.sleep(100);
}

st1.close();

mysqlconn.close();
mysqlconn2.close();

// OBJECT_LOCK and iLine are fields of the enclosing class
public void addLine() {
    synchronized (OBJECT_LOCK) {
        iLine++;
    }
}
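
For anyone who wants to try the pattern without a database or network, here is a minimal self-contained sketch. Everything in it is made up for illustration: the class name `ParallelLineCount`, the helpers `countLines` and `countAll`, and the in-memory strings standing in for the downloaded pages. It also swaps `execute` plus a shared counter for `invokeAll`, which returns one `Future` per task, so no explicit lock is needed.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelLineCount {

    // Counts the lines of one "page". In the real program the Reader would be
    // an InputStreamReader around urlconn.getInputStream() (assumption).
    public static int countLines(Reader source) throws IOException {
        try (BufferedReader in = new BufferedReader(source)) {
            int lines = 0;
            while (in.readLine() != null) {
                lines++;
            }
            return lines;
        }
    }

    // Runs one task per page on a fixed-size pool and returns the total line count.
    public static int countAll(List<String> pages, int poolSize)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            List<Callable<Integer>> tasks = new ArrayList<>();
            for (final String page : pages) {
                tasks.add(() -> countLines(new StringReader(page)));
            }
            int total = 0;
            // invokeAll blocks until every task has completed
            for (Future<Integer> result : pool.invokeAll(tasks)) {
                total += result.get();
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical stand-ins for three downloaded pages
        List<String> pages = Arrays.asList("a\nb\nc", "d\ne", "f");
        System.out.println(countAll(pages, 3));
    }
}
```

Because each task returns its own result, the totals are combined on the calling thread after the work is done. If a task fails, `result.get()` rethrows the failure wrapped in an `ExecutionException`, so errors are not silently swallowed.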