I tried to copy this code to my machine (it comes from this post: Date Format getting disturb when creating .CSV file in Java)
package com.mufapscraping;

//import java.io.BufferedWriter;
import java.io.FileWriter;
import java.io.IOException;
import java.util.ArrayList;
//import java.util.Collections;
import java.util.Iterator;
//import java.util.List;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class ComMufapScraping {
    boolean writeCSVToConsole = true;
    boolean writeCSVToFile = true;
    //String destinationCSVFile = "C:\\convertedCSV.csv";
    boolean sortTheList = true;
    boolean writeToConsole;
    boolean writeToFile;
    public static Document doc = null;
    public static Elements tbodyElements = null;
    public static Elements elements = null;
    public static Elements tdElements = null;
    public static Elements trElement2 = null;
    public static String Dcomma = ", 2";
    public static ArrayList<Elements> sampleList = new ArrayList<Elements>();

    public static void createConnection() throws IOException {
        System.setProperty("http.proxyHost", "191.1.1.123");
        System.setProperty("http.proxyPort", "8080");
        String tempUrl = "http://www.mufap.com.pk/nav_returns_performance.php?tab=01";
        doc = Jsoup.connect(tempUrl).get();
    }

    public static void parsingHTML() throws Exception {
        for (int i = 1; i <= 1; i++) {
            tbodyElements = doc.getElementsByTag("tbody");
            //Element table = doc.getElementById("dataTable");
            if (tbodyElements.isEmpty()) {
                throw new Exception("Table is not found");
            }
            elements = tbodyElements.get(0).getElementsByTag("tr");
            for (Element trElement : elements) {
                trElement2 = trElement.getElementsByTag("tr");
                tdElements = trElement.getElementsByTag("td");
                FileWriter sb = new FileWriter("C:\\convertedCSV2.csv", true);
                for (Iterator<Element> it = tdElements.iterator(); it.hasNext();) {
                    if (it.hasNext()) {
                        sb.append(" \n ");
                    }
                    for (Iterator<Element> it2 = trElement2.iterator(); it.hasNext();) {
                        Element tdElement = it.next();
                        sb.append(tdElement.text());
                        if (it2.hasNext()) {
                            sb.append(" , ");
                        }
                    }
                    System.out.println(sb.toString());
                    sb.flush();
                    sb.close();
                }
                System.out.println(sampleList.add(tdElements));
                /* for (Elements elements2 : zakazky) {
                System.out.println(elements2);
                }*/
            }
        }
    }
However, I can't get the code to run in Eclipse. When I try to run it, I get an exception that looks like this:
Exception in thread "main" java.net.SocketTimeoutException: connect timed out
    at java.net.DualStackPlainSocketImpl.waitForConnect(Native Method)
    at java.net.DualStackPlainSocketImpl.socketConnect(DualStackPlainSocketImpl.java:85)
    at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
    at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
    at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
    at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:172)
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
    at java.net.Socket.connect(Socket.java:579)
    at sun.net.NetworkClient.doConnect(NetworkClient.java:175)
    at sun.net.www.http.HttpClient.openServer(HttpClient.java:432)
    at sun.net.www.http.HttpClient.openServer(HttpClient.java:527)
    at sun.net.www.http.HttpClient.<init>(HttpClient.java:211)
    at sun.net.www.http.HttpClient.New(HttpClient.java:308)
    at sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:1004)
    at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:952)
    at sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:851)
    at org.jsoup.helper.HttpConnection$Response.execute(HttpConnection.java:563)
    at org.jsoup.helper.HttpConnection$Response.execute(HttpConnection.java:540)
    at org.jsoup.helper.HttpConnection.execute(HttpConnection.java:227)
    at org.jsoup.helper.HttpConnection.get(HttpConnection.java:216)
    at exportDataFromWebsite.createConnection(exportDataFromWebsite.java:34)
    at exportDataFromWebsite.main(exportDataFromWebsite.java:83)
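Here is what I have done so far:
Downloaded the jsoup jar file and added it to the project's build path
Named my class and CSV file the same way as in the code
I'm fairly sure the problem lies in the following lines:
System.setProperty("http.proxyHost", "191.1.1.123");
System.setProperty("http.proxyPort", "8080");
String tempUrl = "http://www.mufap.com.pk/nav_returns_performance.php?tab=01";
doc = Jsoup.connect(tempUrl).get();
But I don't know how to change them so that the code runs on my own computer. How do I find out which proxy my computer uses? Any help would be greatly appreciated!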
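By the way, I'm not copying this code for my own benefit. I'm trying to reproduce the problem on my own machine and come up with a useful solution, but I get the error above when I try to run it. Thanks.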
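Also, one more question: when I uncomment //String destinationCSVFile = "C:\\convertedCSV.csv";
I get an error saying "Invalid escape sequence". Why is that? I thought that since it is a string, anything is allowed inside the double quotes?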
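On the escape-sequence error: inside a Java string literal, a backslash starts an escape sequence, so a single backslash followed by a character such as `c` is rejected by the compiler, while a doubled backslash stands for one literal backslash. A minimal sketch, assuming the line in the actual source file contains a single backslash; the class and method names here (`EscapeDemo`, `escapedPath`, `forwardSlashPath`) are made up for illustration:

```java
public class EscapeDemo {
    // Two backslashes in the source produce ONE backslash character at runtime.
    // Writing the path with a single backslash before 'c' would not compile.
    static String escapedPath() {
        return "C:\\convertedCSV.csv";
    }

    // Forward slashes need no escaping; Java's file APIs accept them on Windows too.
    static String forwardSlashPath() {
        return "C:/convertedCSV.csv";
    }

    public static void main(String[] args) {
        System.out.println(escapedPath());
        System.out.println(forwardSlashPath());
    }
}
```

Either form can be passed to `new FileWriter(...)`; the forward-slash form simply avoids the escaping question.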
Answer 0 (score: 0)
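Maybe you are not behind a proxy at all? In that case, remove these lines:
System.setProperty("http.proxyHost", "191.1.1.123");
System.setProperty("http.proxyPort", "8080");
Otherwise, if you are behind a proxy, possibly one that requires authentication, you can try:
// Requires: import java.net.Authenticator; import java.net.PasswordAuthentication;
private static void setGlobalProxy(String proxyHost, int proxyPort, final String authUser, final String authPassword) {
    // Only set the proxy properties when a usable host and port are given.
    if (proxyHost != null && !proxyHost.isEmpty())
        System.setProperty("http.proxyHost", proxyHost);
    if (proxyPort > 0)
        System.setProperty("http.proxyPort", String.valueOf(proxyPort));
    if (authUser != null && authPassword != null) {
        // Supply the credentials whenever the proxy asks for authentication.
        Authenticator.setDefault(new Authenticator() {
            @Override
            public PasswordAuthentication getPasswordAuthentication() {
                return new PasswordAuthentication(authUser, authPassword.toCharArray());
            }
        });
        System.setProperty("http.proxyUser", authUser);
        System.setProperty("http.proxyPassword", authPassword);
    }
}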
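To find out which proxy, if any, your machine is set up to use, one option is to ask the JVM's default `ProxySelector`. The following is only a sketch: the class name `ProxyProbe` and the helper `proxiesFor` are made up for illustration. `java.net.useSystemProxies` is a standard JVM networking property, but as far as I know it is honored only on some platforms and only if it is set before the JVM first looks up a proxy, so treat the result as a hint rather than a definitive answer.

```java
import java.net.Proxy;
import java.net.ProxySelector;
import java.net.URI;
import java.util.List;

public class ProxyProbe {
    // Returns the proxies the JVM would try for the given URL.
    // With no proxy configured, the list holds a single DIRECT entry.
    static List<Proxy> proxiesFor(String url) {
        return ProxySelector.getDefault().select(URI.create(url));
    }

    public static void main(String[] args) {
        // Ask the JVM to consult the operating system's proxy settings.
        // Assumption: this runs before any other networking code in the JVM.
        System.setProperty("java.net.useSystemProxies", "true");

        for (Proxy proxy : proxiesFor("http://www.mufap.com.pk/")) {
            if (proxy.type() == Proxy.Type.DIRECT) {
                System.out.println("Direct connection (no proxy)");
            } else {
                System.out.println(proxy.type() + " proxy at " + proxy.address());
            }
        }
    }
}
```

If this reports a direct connection, the hard-coded `http.proxyHost` and `http.proxyPort` lines are most likely what causes the connect timeout, and removing them should be enough.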