Extracting all course titles with JSoup

Time: 2015-11-03 02:41:42

Tags: java jsoup

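I'm new to web scraping. I've been trying to get all the course titles from Udacity, but so far without success.

Can anyone point me in the right direction? Thanks in advance.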

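import java.io.IOException;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

// Class name is a placeholder; the original snippet omitted the declaration.
public class CourseTitles {

    public static void main(String[] args) {
        try {
            // Fetch and parse the course listing page.
            Document doc = Jsoup.connect("https://www.udacity.com/courses/all").get();

            // Extract Header "1"
            // Element titleWiki = doc.select("h1,h-slim-top").first();

            // Select every <h3> element present in the fetched HTML.
            Elements contents = doc.select("h3");

            System.out.println(contents.size());

            for (Element courseTitle : contents) {
                System.out.println("\nCourse Titles " + courseTitle.text());
            }
        } catch (IOException e) {
            // Report the failure instead of swallowing it silently.
            e.printStackTrace();
        }
    }
}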

1 answer:

Answer 0 (score: 0)

Hope this helps.

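import org.json.JSONArray;
import org.json.JSONException;
import org.json.JSONObject;
import org.jsoup.Jsoup;

// The statements below belong inside a method that declares "throws IOException".
String url = "https://www.udacity.com/public-api/v0/courses";

// Read the raw response body rather than parsing it as HTML, so the JSON
// is not altered by the HTML parser. ignoreContentType(true) lets jsoup
// accept a non-HTML response.
String jsonData = Jsoup
        .connect(url)
        .referrer("https://www.udacity.com/courses/all")
        .userAgent(
                "Mozilla/5.0 (Windows; U; WindowsNT 5.1; en-US; rv1.8.1.6) Gecko/20070725 Firefox/2.0.0.6")
        .ignoreContentType(true)
        .execute()
        .body();

try {
    JSONObject obj = new JSONObject(jsonData);
    JSONArray courses = obj.getJSONArray("courses");

    // Print the "title" field of every entry in the "courses" array.
    for (int i = 0; i < courses.length(); i++) {
        JSONObject course = courses.getJSONObject(i);
        String courseName = course.getString("title");
        System.out.println(courseName);
    }
} catch (JSONException e) {
    // Surface malformed or unexpected JSON instead of ignoring it.
    e.printStackTrace();
}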

Read the API documentation: https://s3.amazonaws.com/content.udacity-data.com/techdocs/UdacityCourseCatalogAPIDocumentation-v0.pdf
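
To try the title-extraction step without a network call or a JSON library on the classpath, here is a minimal, dependency-free sketch. The sample payload is made up; it only mimics the shape the answer's loop relies on (a top-level "courses" array whose objects each have a "title" string). The class name, the SAMPLE constant, and the extractTitles helper are all hypothetical, and the regex is a rough stand-in for real JSON parsing, not a replacement for JSONObject.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CourseTitleSketch {

    // Hypothetical sample shaped like the response the answer's code expects:
    // a top-level "courses" array whose objects each carry a "title" string.
    static final String SAMPLE =
            "{\"courses\":[{\"key\":\"a1\",\"title\":\"Intro to Java\"},"
            + "{\"key\":\"b2\",\"title\":\"Web Scraping Basics\"}]}";

    // Matches "title":"..." pairs anywhere in the text. Escaped characters
    // inside the value are kept as-is (no unescaping) and nesting is ignored,
    // so this is only suitable for quick experiments.
    private static final Pattern TITLE =
            Pattern.compile("\"title\"\\s*:\\s*\"((?:\\\\.|[^\"\\\\])*)\"");

    // Collects the value of every "title" field, in order of appearance.
    static List<String> extractTitles(String json) {
        List<String> titles = new ArrayList<>();
        Matcher m = TITLE.matcher(json);
        while (m.find()) {
            titles.add(m.group(1));
        }
        return titles;
    }

    public static void main(String[] args) {
        for (String title : extractTitles(SAMPLE)) {
            System.out.println(title);
        }
    }
}
```

Running main prints the two sample titles, one per line. For the real API response, a proper JSON parser such as the org.json classes used above is the safer choice, since a regex cannot tell a course's "title" apart from a "title" key nested elsewhere in the payload.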