Question

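I have collected about 2 * 10^5 lines of text, and I need to expand the shortened URLs in them. The problem is that some expanded URLs redirect to yet another shortened URL, which means the original URL was shortened two or more times. How can I handle this efficiently? Following every redirect in a while loop takes too much time.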

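The problem with the following code is that it causes about 2 * 10^5 context switches with my MySQL database. I extract the URL from each text and then expand it. That issues an HTTP request, and the result then has to be written to my database. This takes too much time.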

import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.Proxy;
import java.net.URL;

public class urlExpander {

    public static void main(String[] args) throws IOException {
        String shortenedUrl = "https://www.youtube.com/watch?v=4oZTPGvG3s8&feature=youtu.be";
        String expandedURL = expandUrl(shortenedUrl);

        System.out.println(shortenedUrl + "-->" + expandedURL); 
    }

    public static String expandUrl(String shortenedUrl) throws IOException {
        URL url = new URL(shortenedUrl);
        // open a direct connection, bypassing any configured proxy
        HttpURLConnection httpURLConnection = (HttpURLConnection) url.openConnection(Proxy.NO_PROXY);
        try {
            // do not follow redirects automatically, so the Location header can be read
            httpURLConnection.setInstanceFollowRedirects(false);

            // the Location header holds the redirect target; it is null if the server does not redirect
            String expandedURL = httpURLConnection.getHeaderField("Location");
            return expandedURL != null ? expandedURL : shortenedUrl;
        } finally {
            // release the connection even if reading the header fails
            httpURLConnection.disconnect();
        }
    }
}
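
One possible way to handle URLs that were shortened several times is to follow the redirect chain one hop at a time, with a hop limit and an in-memory cache so that a URL seen in an earlier text is never requested again. The following is only a sketch under stated assumptions, not a tested solution for this data set: the names ChainExpander, HopResolver, httpHop and MAX_HOPS are made up for illustration, the limit of 10 hops and the 5 second timeouts are arbitrary, and it assumes the shortener answers HEAD requests (some services only redirect on GET).

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.Proxy;
import java.net.URL;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class ChainExpander {

    /** One redirect hop: returns the redirect target, or null if the URL does not redirect. */
    public interface HopResolver {
        String nextHop(String url) throws IOException;
    }

    // Assumed upper bound on chain length; guards against very long chains.
    private static final int MAX_HOPS = 10;

    private final HopResolver resolver;
    // Maps every URL already visited to the end of its redirect chain.
    private final Map<String, String> cache = new ConcurrentHashMap<>();

    public ChainExpander(HopResolver resolver) {
        this.resolver = resolver;
    }

    /** Follows redirects from startUrl until no redirect, a loop, or MAX_HOPS is reached. */
    public String expand(String startUrl) throws IOException {
        String cached = cache.get(startUrl);
        if (cached != null) {
            return cached;
        }
        Set<String> seen = new LinkedHashSet<>();
        String current = startUrl;
        // seen.add returns false on a repeated URL, which ends a redirect loop.
        while (seen.size() < MAX_HOPS && seen.add(current)) {
            String known = cache.get(current);
            if (known != null) {          // rest of the chain was resolved earlier
                current = known;
                break;
            }
            String next = resolver.nextHop(current);
            if (next == null) {           // no further redirect: current is the destination
                break;
            }
            current = next;
        }
        // Remember the result for every URL on the chain, not only the first one.
        for (String visited : seen) {
            cache.put(visited, current);
        }
        return current;
    }

    /** Single hop over HTTP, based on the expandUrl method above. */
    public static String httpHop(String url) throws IOException {
        HttpURLConnection conn =
                (HttpURLConnection) new URL(url).openConnection(Proxy.NO_PROXY);
        try {
            conn.setInstanceFollowRedirects(false);
            conn.setRequestMethod("HEAD");   // assumption: the server redirects on HEAD
            conn.setConnectTimeout(5000);
            conn.setReadTimeout(5000);
            int status = conn.getResponseCode();
            String location = conn.getHeaderField("Location");
            if (status < 300 || status >= 400 || location == null) {
                return null;
            }
            // A Location header may be relative; resolve it against the current URL.
            return new URL(new URL(url), location).toString();
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        ChainExpander expander = new ChainExpander(ChainExpander::httpHop);
        for (String shortUrl : args) {
            System.out.println(shortUrl + " --> " + expander.expand(shortUrl));
        }
    }
}
```

Because the cache is keyed by URL, it may also help to collect the distinct URLs from all texts first and expand each one only once. The results could then be held in memory and written to the database in a few large batches instead of one write per text, which might reduce the number of database round trips considerably.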

Recursively expanding shortened URLs in a large text - Java

0 answers: