Hello, I'm trying to scrape some information from a website (http://omhc.nl/site/default.asp?Option=10017&m=1). The table structure is:
<tr>
<td colspan="4" style="border-bottom: 1px solid rgb(0, 0, 0);" width="100%">donderdag 19 april 2012 </td>
</tr>
<tr>
<td width="5%"> </td>
<td width="25%">17:00 - 22:00 </td>
<td bgcolor="" width="40%">KM</td>
<td width="6%">Barhoofd </td>
</tr>
<tr>
<td colspan="4" style="border-bottom: 1px solid rgb(0, 0, 0);" width="100%">vrijdag 20 april 2012 </td>
</tr>
<tr>
<td width="5%"> </td>
<td width="25%">16:30 - 19:30 </td>
<td bgcolor="" width="40%">Ouders/verzorgers van VL</td>
<td width="6%">Bardienst </td>
</tr>
<tr>
<td width="5%"> </td>
<td width="25%">16:30 - 19:30 </td>
<td bgcolor="" width="40%">Ouders/verzorgers van AvdN</td>
<td width="6%">Bardienst </td>
</tr>
<tr>
<td width="5%"> </td>
<td width="25%">16:30 - 21:00 </td>
<td bgcolor="" width="40%">EdK</td>
<td width="6%">Barhoofd </td>
</tr>
<tr>
<td width="5%"> </td>
<td width="25%">21:00 - 23:00 </td>
<td bgcolor="" width="40%">FK</td>
<td width="6%">Barhoofd </td>
</tr>
<tr>
<td width="5%"> </td>
<td width="25%">23:00 - 00:00 </td>
<td bgcolor="" width="40%">SW</td>
<td width="6%">Barhoofd </td>
</tr>
My code:
package maartenbrakkee.bardienst.omhc;

import java.util.List;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Element;
import org.jsoup.nodes.Document;
import org.jsoup.select.Elements;
import android.app.Activity;
import android.os.Bundle;
import android.text.method.ScrollingMovementMethod;
import android.widget.TextView;

public class BardienstActivity extends Activity {
    TextView tv;
    static final String URL = "http://omhc.nl/site/default.asp?Option=10017&m=1";

    @Override
    public void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.main);
        tv = (TextView)findViewById(R.id.tv);
        tv.setSingleLine(false);
        try {
            tv.setMovementMethod(new ScrollingMovementMethod());
            tv.setText(getBarschema());
        } catch (Exception ex) {
            tv.setText("Error");
        }
    }

    protected String getBarschema() throws Exception {
        String result = "";
        Document document = Jsoup.connect(URL).get();
        Elements dagen = document.select("tr:has(td[width=100%]) + tr:has(td[width=40%])");
        // Dag
        String[] dag = new String[dagen.size()];
        int i = 0;
        for (Element dagen1 : dagen) {
            dag[i++] = dagen1.text() + "\n";
        }
        return dag[3];
    }
}
I want dag[0] to be:
donderdag 19 april 2012 + "\n" + 17:00 - 22:00 KM Barhoofd
and dag[1] to be:
vrijdag 20 april 2012 + "\n" + 16:30 - 19:30 Ouders/verzorgers van VL Bardienst + "\n" + 16:30 - 19:30 Ouders/verzorgers van AvdN Bardienst + "\n" + 16:30 - 21:00 EdK Barhoofd + "\n" + 21:00 - 23:00 FK Barhoofd + "\n" + 23:00 - 00:00 SW Barhoofd
But all I get for dag[0] is 17:00 - 22:00 KM Barhoofd. How do I select the right cells (from the first tr with td[width=100%] up to the next tr with td[width=100%])?
Answer 0 (score: 0)
I changed your getBarschema method, mainly the selector:
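// Requires one extra import on top of those in the question: java.util.ArrayList
static protected List<String> getBarschema(String URL) throws Exception {
    Document document = Jsoup.connect(URL).get();
    // New selector: every cell of the table, in document order
    Elements dagen = document.select("div.content table tr td");
    // A List fits better than an array here, since the number of days is not known up front
    List<String> dag = new ArrayList<String>();
    String line = "";
    for (Element dagen1 : dagen) {
        String width = dagen1.attr("width");
        // A cell with width="100%" is a date header, so it closes the previous day
        if (width.equals("100%") && !line.equals("")) {
            dag.add(line);
            line = "";
        }
        line += dagen1.text() + "\n";
    }
    // The last day has no following date header, so add it explicitly
    if (!line.equals("")) {
        dag.add(line);
    }
    return dag;
}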
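Some background on why the original selector returned only one shift per day: `+` is the adjacent sibling combinator, so `tr:has(td[width=100%]) + tr:has(td[width=40%])` matches only the row immediately after each date header, and never the header row itself. The grouping therefore has to happen in code rather than in the selector. Below is a minimal sketch of that grouping step on its own, kept free of jsoup so it can be run without a network connection. `DayGrouper` and `groupByDay` are hypothetical names, and the rule "a row with exactly one cell is a date header" is an assumption based on the `colspan="4"` rows in the HTML from the question.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of jsoup or of the code in the question.
class DayGrouper {

    // Each inner list holds the cell texts of one table row.
    // Assumption: a row with exactly one cell is a date header (the colspan="4" rows).
    static List<String> groupByDay(List<List<String>> rows) {
        List<String> days = new ArrayList<String>();
        StringBuilder current = null;
        for (List<String> cells : rows) {
            if (cells.size() == 1) {
                // A date header closes the previous day and opens a new one
                if (current != null) {
                    days.add(current.toString());
                }
                current = new StringBuilder(cells.get(0).trim());
            } else if (current != null) {
                // A shift row: join its non-empty cells with single spaces
                StringBuilder shift = new StringBuilder();
                for (String cell : cells) {
                    String text = cell.trim();
                    if (text.isEmpty()) {
                        continue;
                    }
                    if (shift.length() > 0) {
                        shift.append(' ');
                    }
                    shift.append(text);
                }
                current.append('\n').append(shift.toString());
            }
        }
        // The last day has no following header, so add it after the loop
        if (current != null) {
            days.add(current.toString());
        }
        return days;
    }
}
```

With jsoup, the input could probably be built by looping over `document.select("tr")` and collecting `td.text()` for every cell in `row.select("td")`. The right table selector depends on the actual page and has not been checked here.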