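I am trying to select specific pieces of information from an HTML page and store them in variables. As a first step, I am just trying to display that information. The problem is that the page I want to extract from has no clear class identifiers, or at least I cannot see how to extract the information.
Here is my Jsoup code: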
public class MainActivity extends Activity {
    private TextView tvmaximo;

    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.activity_main);
        tvmaximo = (TextView) findViewById(R.id.tvmaximo);
        new BackGroundTask().execute();
    }

    @Override
    public boolean onCreateOptionsMenu(Menu menu) {
        // Inflate the menu; this adds items to the action bar if it is present.
        getMenuInflater().inflate(R.menu.main, menu);
        return true;
    }

    class BackGroundTask extends AsyncTask<Void, Void, String> {
        @Override
        protected void onPreExecute() {
            super.onPreExecute();
        }

        @Override
        protected String doInBackground(Void... params) {
            try {
                URL url = new URL("http://www.myweb.com");
                /*Document doc = Jsoup.connect(url.toString()).get();*/
                Document doc = Jsoup.connect(url.toString()).get();
                /*Elements elements = doc.select(".lyrics").first();*/
                //get page title
                /*String title = doc.title();*/
                Elements elements = doc.select("td.headerRouteText");
                String maximo = elements.html();
                return maximo;
            } catch (IOException e) {
                e.printStackTrace();
            }
            return null;
        }

        @Override
        protected void onPostExecute(String result) {
            tvmaximo.setText(result);
            System.out.println(result);
            super.onPostExecute(result);
        }
    }
}
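Here is the HTML I want to save values from: the labels "MÍNIMO", "MÁXIMO", and "VALOR MEDIO", plus the values 89,99, 47,341, and 17,3. Each value should go in its own variable, six variables in total: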
<tr>
<td align="center">
<table>
<tr><td align="center" class="cabeceraRutaTexto" colspan="2">MÁXIMO </td>
<td align="center" class="cabeceraRutaTexto" colspan="2">VALOR MEDIO </td>
<td class="cabeceraRutaTexto" align="center">MÍNIMO </td>
</tr>
<tr><td align="center" class="cabeceraRutaTexto">89,99 <img
SRC="../Diseno/imagenes/euro.gif" WIDTH="7" HEIGHT="8"> /MWh</td>
<td> </td>
<td align="center" class="cabeceraRutaTexto">47,341 <img
SRC="../Diseno/imagenes/euro.gif" WIDTH="7" HEIGHT="8"> /MWh</td>
<td> </td>
<td align="center" class="cabeceraRutaTexto">17,3 <img
SRC="../Diseno/imagenes/euro.gif" WIDTH="7" HEIGHT="8"> /MWh</td>
</tr>
</table>
</td>
</tr>
</table>
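As you can see, it is hard to write a regular expression for this because there is nothing distinctive in the markup to build it on. How can I do this with Jsoup? Thanks in advance for your help and time.
Now, thanks to all your explanations, I have solved the problem: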
public class MainActivity extends Activity {
    private TextView tvmaximo;

    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.activity_main);
        tvmaximo = (TextView) findViewById(R.id.tvmaximo);
        new BackGroundTask().execute();
    }

    @Override
    public boolean onCreateOptionsMenu(Menu menu) {
        // Inflate the menu; this adds items to the action bar if it is present.
        getMenuInflater().inflate(R.menu.main, menu);
        return true;
    }

    class BackGroundTask extends AsyncTask<Void, Void, String> {
        @Override
        protected void onPreExecute() {
            super.onPreExecute();
        }

        @Override
        public String doInBackground(Void... params) {
            try {
                URL url = new URL("http://www.myweb.com");
                Document doc = Jsoup.connect(url.toString()).get();
                /* get elements table 1 (graphic table) */
                Elements elementsgraphic = doc.select("div.divns6");
                elementsgraphic.size(); //2
                /* get elements table 2 (normal table) */
                Elements elements = doc.select("td.cabeceraRutaTexto");
                elements.size(); // at least 7 on the full page: title, 3 labels, 3 values
                String barra1 = elementsgraphic.get(0).text();
                /* text values from table 2 */
                String titulotxt = elements.get(0).text(); // TÍTULO
                String maximotxt = elements.get(1).text(); // TEXTO VALOR MAXIMO
                String mediotxt = elements.get(2).text(); // TEXTO VALOR MEDIO
                String minimotxt = elements.get(3).text(); // TEXTO VALOR MINIMO
                /* numeric values from table 2 */
                String maximo = elements.get(4).text(); // NUMERICO VALOR MAXIMO
                String medio = elements.get(5).text(); // NUMERICO VALOR MEDIO
                String minimo = elements.get(6).text(); // NUMERICO VALOR MINIMO
                return maximo;
            } catch (IOException e) {
                e.printStackTrace();
            }
            return null;
        }

        @Override
        protected void onPostExecute(String result) {
            tvmaximo.setText(result);
            /*System.out.println(result);*/
            super.onPostExecute(result);
        }
    }
}
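Now I have two more questions. First, how can I get the values "18" and "8" out of the "activadiv" calls in the HTML below? I have tried:
Elements elementsgraphic = doc.select("div.divns6");
Elements elementsgraphic = doc.select("div#divns6");
Elements elementsgraphic = doc.select("changeImage('barra1");
and so on. The app crashes on every attempt. I know the mistake is in the regular expression.
Second, I have been trying to use an array to return all these variables to onPostExecute, but I get an error because AsyncTask will not let me return an array. Thanks again for your patience.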
<table>
<tr><td><div name='divns6' id='divns6' style='position:relative;visibility:hidden;'
width='400' height='160'><table valign=botton cellpadding='0' cellspacing='0'
border='0'><tr valign='bottom'>
<td width=15 valign="bottom" height=150><a href="javascript:void(null)"
onMouseOver="changeImage('barra1','','47',2);activadiv('barra0','18');"
onMouseOut="changeImage('barra1','','47',0);desactivadiv('barra1');"><img
NAME="barra1" width="11px" height="47" border="0"></a></td>
<td width=15 valign="bottom" height=150><a href="javascript:void(null)"
onMouseOver="changeImage('barra2','','21',2);activadiv('barra1','8');"
onMouseOut="changeImage('barra2','','21',0);desactivadiv('barra2');"><img
NAME="barra2" width="11px" height="21" border="0"></a></td>
</td></tr></table>
Answer 0 (score: 1)
Select them and extract by position, for example:
Elements elements = doc.select("td.cabeceraRutaTexto");
elements.size(); // 6
elements.get(0).text(); // MÁXIMO
elements.get(1).text(); // VALOR MEDIO
...
// or just the one you want
doc.select("td:eq(0).cabeceraRutaTexto").get(0).text(); // MÁXIMO
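Once the six texts are selected, one way to keep each value in its own field is a small holder object. This is only a sketch under assumptions: the class name PriceSummary and its methods are hypothetical (they are not part of Jsoup or the original code), the six strings are assumed to arrive labels first and values second, and the numbers are assumed to use a comma as the decimal separator, as in the HTML above. It uses only the standard library; the input strings would come from elements.get(i).text().

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.text.ParseException;
import java.util.Locale;

/** Hypothetical holder for three labels and three numeric values. */
final class PriceSummary {
    final String[] labels; // the three header texts, in page order
    final double[] values; // the three numbers, in the same order

    PriceSummary(String[] labels, double[] values) {
        this.labels = labels;
        this.values = values;
    }

    /** Reads the leading number of a cell text such as "89,99 /MWh". */
    static double parseValue(String cellText) {
        // Comma as decimal separator, dot as grouping separator (assumed).
        DecimalFormatSymbols symbols = new DecimalFormatSymbols(Locale.ROOT);
        symbols.setDecimalSeparator(',');
        symbols.setGroupingSeparator('.');
        DecimalFormat format = new DecimalFormat("#,##0.###", symbols);
        try {
            // parse() stops at the first character that is not part of the number.
            return format.parse(cellText.trim()).doubleValue();
        } catch (ParseException e) {
            throw new IllegalArgumentException("Not a number: " + cellText, e);
        }
    }

    /** Builds the holder from six texts: three labels followed by three values. */
    static PriceSummary fromCellTexts(String[] cellTexts) {
        if (cellTexts.length != 6) {
            throw new IllegalArgumentException("Expected 6 cell texts");
        }
        String[] labels = new String[3];
        double[] values = new double[3];
        for (int i = 0; i < 3; i++) {
            labels[i] = cellTexts[i].trim();
            values[i] = parseValue(cellTexts[i + 3]);
        }
        return new PriceSummary(labels, values);
    }
}
```

An object like this could also serve as the result type of the task, for example AsyncTask<Void, Void, PriceSummary>, because the third type parameter of AsyncTask can be any class; a String[] would work there as well.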
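Update from the comments: getting 18 from the given HTML.
Because the value is part of JavaScript code, this adds another level of complexity. The following code yields the desired value, but keep in mind that there are better ways to parse JavaScript and extract parts of it.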
Document doc = Jsoup.parse(xml);
String onMouseOver = doc.select("a").attr("onMouseOver");
// while this will work, there are more robust ways to parse javascript
String value = onMouseOver.split("'")[9]; // "18"
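Splitting on quotes depends on the exact position of the value, so it breaks if the handler changes shape. A slightly less fragile option is a regular expression that looks for the activadiv call itself. The sketch below is an assumption-laden illustration: the class and method names are hypothetical, and it assumes both arguments of activadiv are single-quoted as in the HTML above. It works on the attribute string only, which you would get from attr("onMouseOver") on each a element.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper that pulls the second argument out of activadiv('name','value') calls. */
final class ActivadivArgs {
    // Matches activadiv('<name>','<value>') and captures <value>.
    private static final Pattern CALL =
            Pattern.compile("activadiv\\(\\s*'[^']*'\\s*,\\s*'([^']*)'\\s*\\)");

    /** Returns the second argument of every activadiv call found in the handler text. */
    static List<String> secondArguments(String handler) {
        List<String> values = new ArrayList<>();
        Matcher matcher = CALL.matcher(handler);
        while (matcher.find()) {
            values.add(matcher.group(1));
        }
        return values;
    }
}
```

To collect both values, loop over doc.select("a") and pass each element's onMouseOver attribute to secondArguments; desactivadiv calls in the onMouseOut attribute are not matched because they take a single argument.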
Answer 1 (score: 1)
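Elements elem = doc.select("td.cabeceraRutaTexto");
// Log the text of every matching cell
for (Element el : elem) {
    Log.e("elements :", el.text());
}
or
// Same loop by index; get(i) returns the i-th match (first() would always return element 0)
for (int i = 0; i < elem.size(); i++) {
    Element el = elem.get(i);
    Log.e("element" + i + ":", el.text());
}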