Jsoup select does not use the whole HTML?

Asked: 2016-04-17 16:53:44

Tags: java android html networking jsoup

What is my mistake?

Android code:

ArrayList<String> plan_table = new ArrayList<>();
Element table = doc.select("table").get(1); // First table: Untis banner and school data (address, etc.); second table: the plan, so load the second table (index 1)
Elements rows = table.select("tr");
Log.i("SchollgymPlanThread", "These are the rows: " + rows.toString());

for (int i = 1; i < rows.size(); i++) { // the first row holds the column names, so skip it
    Element row = rows.get(i);
    Elements cols = row.select("td");
    if (cols.isEmpty()) { continue; } // a row without <td> cells has nothing to read
    String firstCell = cols.get(0).text();
    //Log.i("SchollgymPlanThread", firstCell);
    plan_table.add(firstCell);
    if (Pattern.matches("^Klasse .*", firstCell)) {
        // A row starting with "Klasse" opens a new class section
        PlanParsed.put(firstCell, new LinkedHashMap<String, List>());
        current_class = firstCell;
        continue;
    }
    if (current_class != null) {
        if (cols.size() < 3) { continue; } // cols.get(2) below needs at least three cells
        List<String> tmpList = new ArrayList<String>();
        for (int i2 = 1; i2 < cols.size(); i2++) {
            if (i2 == 2) { continue; } // the lesson hour is used as the key, so it is not put in the list
            tmpList.add(cols.get(i2).text());
        }
        Log.i("SchollgymPlanThread", tmpList.toString());
        PlanParsed.get(current_class).put(cols.get(2).text(), tmpList); // PlanParsed[current_class] = {lesson_hour: lesson_attributes}
    }

    //if (row.className().equals("list odd")) {Log.i("SchollgymPlanThread","This is a class: "+cols.get(0).text());}
    //if (cols.get(7).text().equals("down")) {
    //    plan_table.add(cols.get(5).text());
    //}
}

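I have not included the whole Java code, but this is where I run into the problem. In line 4 it prints the HTML code with the td and tr elements, but the output suddenly stops. The last line of the output is: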

<td cla

Is something wrong here? I have already checked the source website...

1 answer:

Answer 0 (score: 0)

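How are you reading the HTML with Jsoup? I ask because you may be hitting the size limit for the loaded document. Unless told otherwise through the maxBodySize() method, Jsoup limits the body to 1 MB. Passing 0 removes the limit, so you may want to do this: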

Document doc = Jsoup.connect("YOUR_URL").maxBodySize(0).get();
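
To illustrate why a body-size cap could leave output that stops at something like `<td cla`, here is a minimal sketch using only the Java standard library. It does not use or reproduce Jsoup's implementation: `BodyLimitDemo`, `readCapped`, and the sample HTML are hypothetical names and data invented for this illustration, and the cap of 18 bytes is chosen only so that the cut falls in the middle of a tag.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

public class BodyLimitDemo {

    // Hypothetical helper: reads at most maxBytes from the stream.
    // maxBytes == 0 means "no limit", the same convention as maxBodySize(0).
    static String readCapped(InputStream in, int maxBytes) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[8];
        int remaining = maxBytes;
        while (true) {
            int want = (maxBytes == 0) ? buffer.length : Math.min(buffer.length, remaining);
            if (want == 0) {
                break; // cap reached: everything after this point is silently dropped
            }
            int n = in.read(buffer, 0, want);
            if (n == -1) {
                break; // end of stream
            }
            out.write(buffer, 0, n);
            if (maxBytes != 0) {
                remaining -= n;
            }
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // Made-up sample markup, not the real page from the question
        String html = "<table><tr><td class=\"list\">Klasse 5a</td></tr></table>";
        byte[] bytes = html.getBytes(StandardCharsets.UTF_8);

        String capped = readCapped(new ByteArrayInputStream(bytes), 18);
        String full = readCapped(new ByteArrayInputStream(bytes), 0);

        System.out.println(capped);            // prints: <table><tr><td cla
        System.out.println(full.equals(html)); // prints: true
    }
}
```

The capped read ends wherever the byte budget runs out, regardless of tag boundaries, which is why a truncated document can end mid-attribute. If the page in question is smaller than the limit, the cause lies elsewhere, so it is worth checking the actual size of the response before relying on this explanation.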