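I have collected about 2 * 10^5 lines of text, and I need to expand the shortened URLs that appear in them. The problem is that some expanded URLs redirect again to another shortened URL, which means the original URL was shortened two or more times. How can I handle this efficiently? If I use a while loop, it takes too much time.
The problem with the code below is that it causes about 2 * 10^5 context switches with my MySQL database. I extract the URL from each text and then expand it. Each expansion issues an HTTP request, and the result then has to be written to my database. This takes too much time.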
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.Proxy;
import java.net.URL;

public class urlExpander {

    public static void main(String[] args) throws IOException {
        String shortenedUrl = "https://www.youtube.com/watch?v=4oZTPGvG3s8&feature=youtu.be";
        String expandedURL = expandUrl(shortenedUrl);
        System.out.println(shortenedUrl + "-->" + expandedURL);
    }

    public static String expandUrl(String shortenedUrl) throws IOException {
        URL url = new URL(shortenedUrl);
        // open the connection without going through a proxy
        HttpURLConnection httpURLConnection = (HttpURLConnection) url.openConnection(Proxy.NO_PROXY);
        try {
            // do not follow redirects automatically, so the Location header stays readable
            httpURLConnection.setInstanceFollowRedirects(false);
            // the Location header holds the destination URL (null if there is no redirect)
            return httpURLConnection.getHeaderField("Location");
        } finally {
            httpURLConnection.disconnect();
        }
    }
}
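
One direction I am considering, as a rough sketch only: follow each redirect chain with a bounded loop, remember every URL already resolved in a shared cache so repeated short links cost nothing, and run the lookups on a thread pool instead of one by one. The names `RedirectChainExpander`, `expand`, `expandAll`, `httpHop` and `MAX_HOPS`, as well as the hop limit, the 5 second timeouts and the thread count, are my own assumptions and not something from an existing library. The `main` method uses an in-memory map in place of real HTTP requests so that it runs without network access.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.Proxy;
import java.net.URL;
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Function;

// Hypothetical helper class, only meant to illustrate the idea.
class RedirectChainExpander {
    // Assumed upper bound on hops, so a long or looping chain cannot run forever.
    static final int MAX_HOPS = 10;

    // Follows redirects from startUrl until nextHop returns null, a URL repeats,
    // a cached result is found, or MAX_HOPS URLs have been visited.
    // nextHop maps a URL to its redirect target, or to null if it does not redirect.
    static String expand(String startUrl, Function<String, String> nextHop, Map<String, String> cache) {
        List<String> visited = new ArrayList<>();
        String current = startUrl;
        while (visited.size() < MAX_HOPS && !visited.contains(current)) {
            String known = cache.get(current);
            if (known != null) {          // resolved earlier: reuse it, no further lookup
                current = known;
                break;
            }
            visited.add(current);
            String next = nextHop.apply(current);
            if (next == null) {           // no redirect: current is the final URL
                break;
            }
            current = next;
        }
        for (String url : visited) {      // every URL on the chain maps to the same result
            cache.put(url, current);
        }
        return current;
    }

    // Expands all distinct URLs on a fixed-size thread pool and returns short -> expanded.
    static Map<String, String> expandAll(Collection<String> urls, Function<String, String> nextHop, int threads)
            throws InterruptedException {
        Map<String, String> cache = new ConcurrentHashMap<>();
        Map<String, String> result = new ConcurrentHashMap<>();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (String url : new HashSet<>(urls)) {
            pool.execute(() -> result.put(url, expand(url, nextHop, cache)));
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.HOURS);
        return result;
    }

    // One real HTTP hop: returns the Location header (resolved against url), or null.
    // Assumes an http/https URL; HEAD is used so that no response body is downloaded.
    static String httpHop(String url) {
        try {
            URL base = new URL(url);
            HttpURLConnection connection = (HttpURLConnection) base.openConnection(Proxy.NO_PROXY);
            try {
                connection.setInstanceFollowRedirects(false);
                connection.setRequestMethod("HEAD");
                connection.setConnectTimeout(5000);   // assumed timeout values
                connection.setReadTimeout(5000);
                String location = connection.getHeaderField("Location");
                return location == null ? null : new URL(base, location).toString();
            } finally {
                connection.disconnect();
            }
        } catch (IOException e) {
            return null;                  // treat a failed request as "no further redirect"
        }
    }

    public static void main(String[] args) throws InterruptedException {
        // In-memory stand-in for the network: a URL that was shortened twice.
        Map<String, String> fake = Map.of(
                "http://a/1", "http://b/2",
                "http://b/2", "http://final/page");
        System.out.println(expandAll(List.of("http://a/1", "http://b/2"), fake::get, 4));
    }
}
```

For real links, `RedirectChainExpander::httpHop` would be passed instead of `fake::get`. The returned map could then probably be written to MySQL in batches with `PreparedStatement.addBatch()` and `executeBatch()` rather than one statement per URL, though I have not measured how much that helps.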