Parsing an HTML table into JSON with Jsoup in Java

Time: 2017-02-24 19:40:40

Tags: html json html-table jsoup html-parsing

I have an HTML table in the following format:

<table>
    <tbody>
        <tr>
            <td>Book1</td>
            <td>Group1</td>
            <td>Code1</td>
            <td>Lesson1</td>
            <td>Day1</td>
            <td>Day2</td>
            <td>Day3</td>
        </tr>
        <tr>
            <td>Book2</td>
            <td>Group2</td>
            <td>Code2</td>
            <td>Lesson2</td>
            <td>Day1</td>
            <td>Day2</td>
            <td>Day3</td>
        </tr>
    </tbody>
</table>

I want to parse this HTML with Jsoup and output a JSON string in the following format:

{
   "Book1": {
      "Group": "Group1",
      "Code": "Code1",
      "Lesson": "Lesson1",
      "Day1": "Day1",
      "Day2": "Day2",
      "Day3": "Day3"
   },
   "Book2": {
      "Group": "Group2",
      "Code": "Code2",
      "Lesson": "Lesson2",
      "Day1": "Day1",
      "Day2": "Day2",
      "Day3": "Day3"
   }
}

I tried this code:

public String TableToJson(String source) throws JSONException {
    Document doc = Jsoup.parse(source);
    JSONObject jsonObject = new JSONObject();
    JSONArray list = new JSONArray();
    for (Element table : doc.select("table")) {
        for (Element row : table.select("tr")) {
            Elements tds = row.select("td");
            String Name = tds.get(0).text();
            String Group = tds.get(1).text();
            String Code = tds.get(2).text();

            jsonObject.put("Name", Name); 
            jsonObject.put("Group", Group);
            jsonObject.put("Code", Code);
            list.put(jsonObject);
        }
    }
    return list.toString();
}

But it returns the wrong result:

[
    {
        "Name": "Book1",
        "Group": "Group1",
        "Code": "Code1"
    },
    {
        "Name": "Book1",
        "Group": "Group1",
        "Code": "Code1"
    }
]

I can't change the table markup because it lives on another server.

How can I get the desired result from this input using Jsoup in Java?

1 answer:

Answer 0: (score: 2)

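The problem with your code is that you reuse the same jsonObject for every row, and you also use a JSONArray that you don't need. What you need is an object containing objects, not an array of objects.
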
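The effect of reusing one object can be reproduced without Jsoup or org.json. The following is only a minimal sketch: the class `SharedObjectDemo` and its two methods are hypothetical names, and plain `Map` and `List` instances stand in for `JSONObject` and `JSONArray`. When one map is created outside the loop, every list entry is a reference to that same map, so all entries reflect whichever row was written last.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical demo class; plain collections stand in for JSONObject/JSONArray.
class SharedObjectDemo {

    // Mirrors the question's loop: one map is created once and reused for every row.
    static List<Map<String, String>> withSharedObject(String[][] rows) {
        List<Map<String, String>> list = new ArrayList<>();
        Map<String, String> shared = new LinkedHashMap<>();
        for (String[] row : rows) {
            shared.put("Name", row[0]);  // overwrites the value stored for the previous row
            shared.put("Group", row[1]);
            list.add(shared);            // adds another reference to the same map
        }
        return list;
    }

    // Mirrors the fix: a new map is created inside the loop for each row.
    static List<Map<String, String>> withFreshObject(String[][] rows) {
        List<Map<String, String>> list = new ArrayList<>();
        for (String[] row : rows) {
            Map<String, String> fresh = new LinkedHashMap<>();
            fresh.put("Name", row[0]);
            fresh.put("Group", row[1]);
            list.add(fresh);
        }
        return list;
    }

    public static void main(String[] args) {
        String[][] rows = { {"Book1", "Group1"}, {"Book2", "Group2"} };
        System.out.println(withSharedObject(rows)); // both entries are the same map
        System.out.println(withFreshObject(rows));  // one distinct map per row
    }
}
```

For the same reason, the corrected method creates a new JSONObject inside the row loop:
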
// Assumes the Jsoup and org.json libraries are on the classpath.
import org.json.JSONException;
import org.json.JSONObject;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public String TableToJson(String source) throws JSONException {
    Document doc = Jsoup.parse(source);
    // Outer object: one entry per row, keyed by the first cell (the book name).
    JSONObject jsonParentObject = new JSONObject();
    for (Element table : doc.select("table")) {
        for (Element row : table.select("tr")) {
            Elements tds = row.select("td");
            if (tds.size() < 7) {
                continue; // skip header rows or rows with missing cells
            }
            // Create a new object for every row instead of reusing one instance.
            JSONObject jsonObject = new JSONObject();
            jsonObject.put("Group", tds.get(1).text());
            jsonObject.put("Code", tds.get(2).text());
            jsonObject.put("Lesson", tds.get(3).text());
            jsonObject.put("Day1", tds.get(4).text());
            jsonObject.put("Day2", tds.get(5).text());
            jsonObject.put("Day3", tds.get(6).text());
            jsonParentObject.put(tds.get(0).text(), jsonObject);
        }
    }
    return jsonParentObject.toString();
}
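
If the table ever gains or loses columns, listing the key names once and looping over them may be easier to maintain than one variable per cell. The following is only a sketch of that idea with plain collections: the class `RowsToNestedMap`, the `KEYS` list, and the `toNestedMap` method are hypothetical, and each row is assumed to arrive as a list of cell texts (for example, the text of each td in a tr). Turning the nested map into a JSON string is left to whichever JSON library is in use.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper; each row is given as a list of cell texts.
class RowsToNestedMap {

    // Key names for cells 1..6; cell 0 (the book name) becomes the outer key.
    static final List<String> KEYS =
            Arrays.asList("Group", "Code", "Lesson", "Day1", "Day2", "Day3");

    static Map<String, Map<String, String>> toNestedMap(List<List<String>> rows) {
        Map<String, Map<String, String>> result = new LinkedHashMap<>();
        for (List<String> cells : rows) {
            if (cells.size() < KEYS.size() + 1) {
                continue; // skip header rows or rows with missing cells
            }
            // A new inner map per row, so rows never share state.
            Map<String, String> inner = new LinkedHashMap<>();
            for (int i = 0; i < KEYS.size(); i++) {
                inner.put(KEYS.get(i), cells.get(i + 1));
            }
            result.put(cells.get(0), inner);
        }
        return result;
    }

    public static void main(String[] args) {
        List<List<String>> rows = Arrays.asList(
                Arrays.asList("Book1", "Group1", "Code1", "Lesson1", "Day1", "Day2", "Day3"),
                Arrays.asList("Book2", "Group2", "Code2", "Lesson2", "Day1", "Day2", "Day3"));
        System.out.println(toNestedMap(rows));
    }
}
```

Note that if two rows share the same first cell, the later row replaces the earlier one, just as it would with `jsonParentObject.put(...)` above.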

Let me know if you need any clarification!