I'm scraping web data, parsing it, and writing it to a database. This is the "insert" part:
public void insert(String comp, String title, String date, String location, String keyword){
String query = "INSERT INTO "+ dbtablename +" "
+ "(company_name,job_title,date_created,location, platform, keyword) VALUES "
+ "(\""+comp+"\",\""+title+"\",\""+date+"\",\""+location+"\",\""+ platform +"\",\""+keyword + "\");";
OpenConnectionDB();
try {
this.statement = this.connection.createStatement();
this.statement.execute(query);
statement.close();
} catch (SQLException ex) {
Logger.getLogger(Database.class.getName()).log(Level.SEVERE, null, ex);
}
finally {closeConnectionDB();}
closeConnectionDB();
}
The connection is created like this:
public void getData(Database c) throws IOException
{
// try {
CSVReader reader = new CSVReader(new FileReader(csvFilename), ';');
String[] row = null;
while((row = reader.readNext()) != null) {
for (int i=0; i< row.length; i=i+2 )
{
System.out.println(row[i].trim());
System.out.println( "C:/Talend/workspace/WEBCRAWLER/output/keywords_"+row[i].trim()+".txt");
Document document = Jsoup.parse(new File("C:/Talend/workspace/WEBCRAWLER/output/keywords_"+row[i].trim()+".txt"), "utf-8");
Elements elements = document.select(".joblisting");
for (Element element : elements)
{
// Parse Data into Elements
Elements jobTitleElement = element.select(".job_title span");
Elements companyNameElement = element.select(".company_name span[itemprop=name]");
Elements locationElement = element.select(".locality span[itemprop=addressLocality]");
Elements dateElement = element.select(".job_date_added [datetime]");
// Strip Data from unnecessary tags
String companyName = companyNameElement.text();
String jobTitle = jobTitleElement.text();
String location = locationElement.text();
String timeAdded = dateElement.attr("datetime");
String cleanJobTitle = jobTitle.replaceAll("\"", "");
c.insert(companyName, cleanJobTitle, timeAdded, location, row[i].trim());
c.closeConnectionDB();
}
}
// return row[rowCount];
}
//...
reader.close();
}
It works fine, but at some point it throws this error:
Jul 29, 2014 1:26:03 PM org.jsoup.examples.Database OpenConnectionDB
SEVERE: null
com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure
The last packet sent successfully to the server was 0 milliseconds ago. The driver has not received any packets from the server.
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(Unknown Source)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(Unknown Source)
at java.lang.reflect.Constructor.newInstance(Unknown Source)
at com.mysql.jdbc.Util.handleNewInstance(Util.java:409)
at com.mysql.jdbc.SQLError.createCommunicationsException(SQLError.java:1127)
at com.mysql.jdbc.MysqlIO.<init>(MysqlIO.java:356)
at com.mysql.jdbc.ConnectionImpl.coreConnect(ConnectionImpl.java:2502)
at com.mysql.jdbc.ConnectionImpl.connectOneTryOnly(ConnectionImpl.java:2539)
at com.mysql.jdbc.ConnectionImpl.createNewIO(ConnectionImpl.java:2321)
at com.mysql.jdbc.ConnectionImpl.<init>(ConnectionImpl.java:832)
at com.mysql.jdbc.JDBC4Connection.<init>(JDBC4Connection.java:46)
at sun.reflect.GeneratedConstructorAccessor5.newInstance(Unknown Source)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(Unknown Source)
at java.lang.reflect.Constructor.newInstance(Unknown Source)
at com.mysql.jdbc.Util.handleNewInstance(Util.java:409)
at com.mysql.jdbc.ConnectionImpl.getInstance(ConnectionImpl.java:417)
at com.mysql.jdbc.NonRegisteringDriver.connect(NonRegisteringDriver.java:344)
at java.sql.DriverManager.getConnection(Unknown Source)
at java.sql.DriverManager.getConnection(Unknown Source)
at org.jsoup.examples.Database.OpenConnectionDB(Database.java:35)
at org.jsoup.examples.Database.insert(Database.java:70)
at org.jsoup.examples.parseEasy.getData(parseEasy.java:65)
at startWorkflow.main(startWorkflow.java:27)
Caused by: java.net.SocketException: Permission denied: connect
at java.net.DualStackPlainSocketImpl.connect0(Native Method)
at java.net.DualStackPlainSocketImpl.socketConnect(Unknown Source)
at java.net.AbstractPlainSocketImpl.doConnect(Unknown Source)
at java.net.AbstractPlainSocketImpl.connectToAddress(Unknown Source)
at java.net.AbstractPlainSocketImpl.connect(Unknown Source)
at java.net.PlainSocketImpl.connect(Unknown Source)
at java.net.SocksSocketImpl.connect(Unknown Source)
at java.net.Socket.connect(Unknown Source)
at java.net.Socket.connect(Unknown Source)
at java.net.Socket.<init>(Unknown Source)
at java.net.Socket.<init>(Unknown Source)
at com.mysql.jdbc.StandardSocketFactory.connect(StandardSocketFactory.java:258)
at com.mysql.jdbc.MysqlIO.<init>(MysqlIO.java:306)
... 17 more
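It feels like I'm doing an unnecessary open() or close() somewhere. Do you see one in my huge wall of code? (Sorry.) Can you help me resolve this error?
Many thanks to the community so far!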
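Answer 0: (score: 1)
Here is a quick refactoring that reuses a single connection and uses a prepared statement. I'm not sure which database driver you are using, so the open/close calls and the prepared statement should get you into the ballpark, but they may well not be syntactically correct.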
// Assumes dbtablename is a static constant, and that platform and csvFilename are fields of this class.
private static final String KEYWORD_INSERT = "INSERT INTO " + dbtablename
        + " (company_name, job_title, date_created, location, platform, keyword)"
        + " VALUES (?, ?, ?, ?, ?, ?)";

public void getData(Database c) throws IOException, SQLException {
    // getDbConnection() is a placeholder for whatever accessor Database exposes.
    // try-with-resources closes the reader, statement, and connection exactly once, even on failure.
    try (Connection connection = c.getDbConnection();
         PreparedStatement stmt = connection.prepareStatement(KEYWORD_INSERT);
         CSVReader reader = new CSVReader(new FileReader(csvFilename), ';')) {
        String[] row;
        while ((row = reader.readNext()) != null) {
            for (int i = 0; i < row.length; i = i + 2) {
                String keyword = row[i].trim();
                Document document = Jsoup.parse(new File("C:/Talend/workspace/WEBCRAWLER/output/keywords_" + keyword + ".txt"), "utf-8");
                Elements elements = document.select(".joblisting");
                for (Element element : elements) {
                    // Parse data into elements
                    Elements jobTitleElement = element.select(".job_title span");
                    Elements companyNameElement = element.select(".company_name span[itemprop=name]");
                    Elements locationElement = element.select(".locality span[itemprop=addressLocality]");
                    Elements dateElement = element.select(".job_date_added [datetime]");
                    // Strip data of unnecessary tags
                    String companyName = companyNameElement.text();
                    String jobTitle = jobTitleElement.text().replaceAll("\"", "");
                    String location = locationElement.text();
                    String timeAdded = dateElement.attr("datetime");
                    // Bind the values; the driver handles quoting and escaping
                    stmt.setString(1, companyName);
                    stmt.setString(2, jobTitle);
                    stmt.setString(3, timeAdded);
                    stmt.setString(4, location);
                    stmt.setString(5, platform);
                    stmt.setString(6, keyword);
                    stmt.executeUpdate();
                }
            }
        }
    }
}
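To make the difference between the two approaches concrete, here is a minimal sketch that runs without any database. It only counts how many connections each pattern opens. FakeConnection and ConnectionReuseDemo are hypothetical stand-ins invented for this illustration; they are not part of JDBC or of the code in the question.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for a database connection; counts how often one is opened. */
final class FakeConnection implements AutoCloseable {
    static int opened = 0;          // total number of connections opened so far
    private boolean closed = false;

    FakeConnection() {
        opened++;
    }

    void insert(String row) {
        if (closed) {
            throw new IllegalStateException("connection already closed");
        }
        // A real implementation would execute a PreparedStatement here.
    }

    @Override
    public void close() {
        closed = true;              // calling close() twice is harmless here
    }
}

final class ConnectionReuseDemo {
    /** Pattern from the question: open and close a connection for every row. */
    static int insertOpeningPerRow(List<String> rows) {
        int before = FakeConnection.opened;
        for (String row : rows) {
            try (FakeConnection c = new FakeConnection()) {
                c.insert(row);
            }
        }
        return FakeConnection.opened - before;
    }

    /** Pattern from the answer: open one connection and reuse it for all rows. */
    static int insertReusingConnection(List<String> rows) {
        int before = FakeConnection.opened;
        try (FakeConnection c = new FakeConnection()) {
            for (String row : rows) {
                c.insert(row);
            }
        }
        return FakeConnection.opened - before;
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList("a", "b", "c");
        System.out.println("opened per row: " + insertOpeningPerRow(rows));      // prints 3
        System.out.println("opened when reused: " + insertReusingConnection(rows)); // prints 1
    }
}
```

With a real driver, each of those opens is a new TCP connection to the server, so a crawl that inserts thousands of rows opens thousands of sockets in quick succession. Whether that is what triggers the exception above depends on your environment, but reusing one connection removes the churn either way.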