Why does jsoup behave differently in Android Studio and NetBeans?

Asked: 2015-03-15 03:59:12

Tags: jsoup

I have this method.

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.select.Elements;

private static String parsePageHeaderInfo(String urlStr) throws Exception {

    String word_google  = "google";
    String word_twitter = "twitter";

    String title , description , image , content;
    image  = "";

    Document doc = Jsoup.connect(urlStr).userAgent("Mozilla").get();

    title = doc.title();

    if(title.equals(""))
    {
        title= doc.select("meta[property=og:title]").attr("content");
    }

    description  = doc.select("meta[name=description]").attr("content");

    if(description.equals(""))
    {
       description= doc.select("meta[name=keywords]").attr("content");
    }

    if(description.equals(""))
    {         
        description= doc.select("meta[property=og:description]").attr("content");            
    }

    if(description.equals(""))
    {
        description = title;
    }

    Elements src_img = doc.select("img[src~=(?i)\\.(png|jpe?g)]");

    if(src_img.size() > 0 )
    {
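       // Note: <img> elements normally carry their URL in "src", not "content",
       // so this lookup usually returns "" and the og:image fallback below is used.
       image = src_img.first().attr("content");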
    }

    if(image.equals(""))
    {
        image = doc.select("meta[property=og:image]").attr("content");
    } 

    if(image.equals(""))
    {
        src_img = doc.select("link[href~=(?i)\\.(ico)]");
        if(src_img.size() > 0 )
        {
            if(urlStr.contains(word_twitter) && image.equals(""))
            {
                image = src_img.first().attr("href");    
            }
            else
            {
                image = urlStr + src_img.first().attr("href");    
            }
        }
    }         

    if(urlStr.contains(word_google) && image.equals(""))
    {
        image = urlStr + "/images/google_favicon_128.png";
    }

    return title  +  " \n a "+ description + "  \n b" + image ;  

}

        String e = parsePageHeaderInfo("https://www.youtube.com/watch?v=HMUDVMiITOU");
        System.out.println(e);

When I run this code in Android Studio, the output is:

title : YouTube.
description : YouTube.
image : https://www.youtube.com/watch?v=HMUDVMiITOU//s.ytimg.com/yts/favicon-vfldLzJxy.ico.

But in NetBeans the output is:

title : DJ Snake, Lil Jon - Turn Down for What - YouTube.
description : Download the single on iTunes: http://smarturl.it/TD4W Director- Daniels Producer- Judy Craig Co Producer- Jonathan Wang Executive Producer- Candice Ouaknine...
image : https://i.ytimg.com/vi/HMUDVMiITOU/hqdefault.jpg.

What causes the difference? The second output is the correct one.

1 answer:

Answer 0 (score: 0)

Try a different user agent, for example Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/36.0.1944.0 Safari/537.36. Also include the referrer http://www.google.com.

The user agent you supplied ("Mozilla") may be incomplete or may not be recognized as valid.
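As a rough illustration of that suggestion, the sketch below prepares a request that carries a full browser user agent and a referrer. It uses only the JDK's HttpURLConnection so that it runs without extra libraries; the class name BrowserLikeRequest and the method name prepare are made up for this example. With jsoup, the corresponding calls would be userAgent(...) and referrer(...) on the connection returned by Jsoup.connect, as noted in the comment. Whether these headers actually change what YouTube returns on Android is an assumption to verify, not something this code proves.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;

// Hypothetical helper class, not part of jsoup or the original question.
class BrowserLikeRequest {

    // Full desktop browser user agent string quoted in the answer.
    static final String USER_AGENT = "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_2) "
            + "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/36.0.1944.0 Safari/537.36";

    // Referrer suggested in the answer; HTTP spells the header name "Referer".
    static final String REFERRER = "http://www.google.com";

    // Prepares a GET request with both headers set. Nothing is sent yet:
    // the connection is only opened once the response is actually read.
    static HttpURLConnection prepare(String urlStr) throws IOException {
        HttpURLConnection conn =
                (HttpURLConnection) URI.create(urlStr).toURL().openConnection();
        conn.setRequestMethod("GET");
        conn.setRequestProperty("User-Agent", USER_AGENT);
        conn.setRequestProperty("Referer", REFERRER);
        return conn;
    }

    public static void main(String[] args) throws IOException {
        HttpURLConnection conn = prepare("https://www.youtube.com/watch?v=HMUDVMiITOU");
        // Show the headers that would go out with the request.
        System.out.println("User-Agent: " + conn.getRequestProperty("User-Agent"));
        System.out.println("Referer: " + conn.getRequestProperty("Referer"));
        // jsoup equivalent of the same two settings:
        // Jsoup.connect(urlStr).userAgent(USER_AGENT).referrer(REFERRER).get();
    }
}
```

Running the same request with identical headers on both platforms is a simple way to check whether the user agent is really what makes the two outputs differ.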

You can find many user agents here
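Separately from the user agent, the image line in the Android Studio output is the page URL with the favicon href appended to it, because the method builds that value as urlStr + href. A possible fix is to resolve the href against the page URL instead of concatenating. The sketch below does this with java.net.URI from the JDK; the class name FaviconUrl and the "/favicon.ico" input are hypothetical, used only for illustration. In jsoup itself, calling absUrl("href") on the element should give the same kind of result.

```java
import java.net.URI;

// Hypothetical helper class, not part of jsoup or the original question.
class FaviconUrl {

    // Resolves an href found in the page against the page's own URL,
    // following the standard relative-reference rules (RFC 3986).
    static String resolve(String pageUrl, String href) {
        return URI.create(pageUrl).resolve(href).toString();
    }

    public static void main(String[] args) {
        String page = "https://www.youtube.com/watch?v=HMUDVMiITOU";

        // Protocol-relative href, as seen in the Android Studio output.
        System.out.println(resolve(page, "//s.ytimg.com/yts/favicon-vfldLzJxy.ico"));
        // prints https://s.ytimg.com/yts/favicon-vfldLzJxy.ico

        // Host-relative href (made-up input for illustration).
        System.out.println(resolve(page, "/favicon.ico"));
        // prints https://www.youtube.com/favicon.ico
    }
}
```

This also removes the need for the special case on URLs containing "twitter", since absolute hrefs are returned unchanged by the resolution step.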