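I need to parse http://developer.android.com/about/dashboards/index.html and import the data from the first table ("Platform Versions") into my MySQL database. Unfortunately, my program picks up the last table and writes its data to the database instead. It is supposed to collect all elements with the tag "table" and select the first of the three. Any suggestions as to what is going wrong? I could not find a solution on Stack Overflow.
Regards, olli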
    // Imports needed by this snippet:
    // java.io.IOException, java.net.URL, java.net.URLConnection, java.sql.*, java.util.ArrayList,
    // org.jsoup.Jsoup, org.jsoup.nodes.Document, org.jsoup.nodes.Element, org.jsoup.select.Elements
    ArrayList<Table> tableList = new ArrayList<Table>();
    String dbUrl = "jdbc:mysql://localhost:3306/crawler";
    String user = "root";
    String password = "";
    String driver = "com.mysql.jdbc.Driver";
    Connection conn = null;
    try {
        Class.forName(driver);
        // Connect to the MySQL database
        conn = DriverManager.getConnection(dbUrl, user, password);
        Statement stmt = conn.createStatement();
        URL url = new URL("http://developer.android.com/about/dashboards/index.html");
        // Open a plain GET connection to the page
        URLConnection con = url.openConnection();
        Document doc = Jsoup.parse(con.getInputStream(), "UTF-8", "http://developer.android.com/about/dashboards/index.html");
        Elements allTables = doc.getElementsByTag("table"); // all tables present in the static HTML
        Element table = allTables.get(0); // first table
        Elements rows = table.getElementsByTag("tr"); // every row of that table
        for (Element row : rows) {
            Elements cells = row.getElementsByTag("td"); // every data cell of the row
            int count = 0;
            Table table1 = new Table();
            for (Element cell : cells) {
                String cellText = cell.text(); // text content of the cell
                if (count == 0) {
                    table1.setVersion(cellText);
                } else if (count == 1) {
                    table1.setCodename(cellText);
                } else if (count == 2) {
                    table1.setApi(Integer.parseInt(cellText));
                } else if (count == 3) {
                    table1.setDistribution(Float.parseFloat(cellText));
                }
                count++;
            }
            // Rows without <td> cells (e.g. header rows) are skipped
            if (count != 0) {
                tableList.add(table1);
            }
        }
        // Insert each parsed row; a PreparedStatement avoids quoting problems and SQL injection
        PreparedStatement insert = conn.prepareStatement(
                "INSERT INTO verteilung_android2 (Version, Codename, API, Distribution) VALUES (?, ?, ?, ?)");
        for (Table table1 : tableList) {
            insert.setString(1, table1.getVersion());
            insert.setString(2, table1.getCodename());
            insert.setInt(3, table1.getApi());
            insert.setFloat(4, table1.getDistribution());
            insert.executeUpdate();
        }
        System.out.println(tableList);
    } catch (SQLException e) {
        e.printStackTrace();
    } catch (IOException e) {
        e.printStackTrace();
    } catch (ClassNotFoundException e) {
        e.printStackTrace();
    }
}
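Answer 0 (score: 0)
The problem is that the information you want is not in an HTML table. It is contained in the JavaScript code for the chart, like this: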
<script>
var VERSION_DATA =
[
  {
    "chart": "//chart.googleapis.com/chart?chl=Froyo%7CGingerbread%7CIce%20Cream%20Sandwich%7CJelly%20Bean%7CKitKat&chf=bg%2Cs%2C00000000&chd=t%3A0.8%2C14.9%2C12.3%2C58.4%2C13.6&chco=c4df9b%2C6fad0c&cht=p&chs=500x250",
    "data": [
      {
        "api": 8,
        "name": "Froyo",
        "perc": "0.8"
      },
      {
        "api": 10,
        "name": "Gingerbread",
        "perc": "14.9"
      },
      {
        "api": 15,
        "name": "Ice Cream Sandwich",
        "perc": "12.3"
      },
      {
        "api": 16,
        "name": "Jelly Bean",
        "perc": "29.0"
      },
      {
        "api": 17,
        "name": "Jelly Bean",
        "perc": "19.1"
      },
      {
        "api": 18,
        "name": "Jelly Bean",
        "perc": "10.3"
      },
      {
        "api": 19,
        "name": "KitKat",
        "perc": "13.6"
      }
    ]
  }
];
</script>
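Jsoup does not execute or parse JavaScript, so the table that the script builds at runtime never appears in the document Jsoup sees. You can, however, parse the script text yourself. Something like the following is not exactly what you want, but it should give you an idea of where to start.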
Element script = doc.select("script").get(7); // The chart script; index 7 depends on the page layout and may change
String scriptText = script.data(); // Raw JavaScript inside the <script> element
String[] lines = scriptText.split("\\r?\\n"); // Split the script into individual lines
for (String line : lines) {
    // Print only the lines that carry one of the fields of interest
    if (line.contains("api") || line.contains("name") || line.contains("perc")) {
        System.out.println(line);
    }
}
There may well be a library for parsing JavaScript code out there, but I am not aware of one.
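Building on the line-by-line idea, here is a minimal sketch of turning the script text into typed values with a regular expression. It assumes every entry keeps the field order `api`, `name`, `perc` shown in the snippet above. The class name `VersionDataParser`, the `Entry` type, and the `parse` method are made up for illustration, and the sample string in `main` stands in for the text you would get from the script element.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Jsoup: extracts the entries of the "data" array.
class VersionDataParser {

    /** One entry of the "data" array: API level, codename, and share in percent. */
    static final class Entry {
        public final int api;
        public final String name;
        public final double perc;

        Entry(int api, String name, double perc) {
            this.api = api;
            this.name = name;
            this.perc = perc;
        }

        @Override
        public String toString() {
            return name + " (API " + api + "): " + perc + "%";
        }
    }

    // Assumption: fields always appear in the order api, name, perc.
    // \s* also matches line breaks, so entries may span several lines.
    private static final Pattern ENTRY = Pattern.compile(
            "\"api\"\\s*:\\s*(\\d+)\\s*,\\s*"
            + "\"name\"\\s*:\\s*\"([^\"]*)\"\\s*,\\s*"
            + "\"perc\"\\s*:\\s*\"([\\d.]+)\"");

    /** Returns every api/name/perc triple found in the given script text. */
    static List<Entry> parse(String scriptText) {
        List<Entry> entries = new ArrayList<>();
        Matcher m = ENTRY.matcher(scriptText);
        while (m.find()) {
            entries.add(new Entry(
                    Integer.parseInt(m.group(1)),
                    m.group(2),
                    Double.parseDouble(m.group(3))));
        }
        return entries;
    }

    public static void main(String[] args) {
        // Stand-in for the script text taken from the page
        String sample = "var VERSION_DATA = [ { \"data\": [\n"
                + "  {\n    \"api\": 8,\n    \"name\": \"Froyo\",\n    \"perc\": \"0.8\"\n  },\n"
                + "  {\n    \"api\": 19,\n    \"name\": \"KitKat\",\n    \"perc\": \"13.6\"\n  }\n"
                + "] } ];";
        for (Entry e : parse(sample)) {
            System.out.println(e);
        }
    }
}
```

Each `Entry` can then be mapped onto your `Table` object and inserted into the database. A regular expression is brittle if the page changes its format; since the array literal looks like valid JSON, cutting it out of the script and handing it to a JSON library would probably be more robust.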