jsoup: image is not being parsed

Time: 2012-07-27 07:18:51

Tags: java jsoup

I am using jsoup to retrieve an image from the following web page { http://www.jcpenney.com/dotcom/jewelry-watches/fine-jewelry/mens-jewelry/bulova%25c2%25ae-mens-stainless-steel-watch/prod.jump?ppId=180d97e&catId=cat100240089&selectedLotId=0514592&selectedSKUId=05145920000&navState=navState-:catId-cat100240089:subcatId-:subcatZone-false:N-100240089%20158:Ns-:Nao-0:ps-24:pn-1:Ntt-:Nf-:action-guided%20navigation&catId=SearchResults }. My code is:

String url = "http://www.jcpenney.com/dotcom/jewelry-watches/fine-jewelry/mens-jewelry/bulova%25c2%25ae-mens-stainless-steel-watch/prod.jump?ppId=180d97e&catId=cat100240089&selectedLotId=0514592&selectedSKUId=05145920000&navState=navState-:catId-cat100240089:subcatId-:subcatZone-false:N-100240089%20158:Ns-:Nao-0:ps-24:pn-1:Ntt-:Nf-:action-guided%20navigation&catId=SearchResults";


Document doc = Jsoup.connect(url).userAgent("Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/535.2 (KHTML, like Gecko) Chrome/15.0.874.120 Safari/535.2").get();

String imgUrl = doc.select("#mapImageSjElement4 img").attr("abs:src");

It should return the image URL, but I am not getting one. Any suggestions? I want to retrieve the main product image on the left side of the page.

1 answer:

Answer 0: (score: 0)

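If you print the whole document, you will see that this image, like much of the site's other content, is loaded by JavaScript scripts spread across the page. jsoup does not execute JavaScript, so the element your selector targets is not in the HTML it parses. To get the image, you have to choose between two options:

  1. Use a GUI-less (headless) web browser, for example one driven through Selenium WebDriver, and read the HTML once the page has fully loaded. Note that a plain HTTP client such as HttpClient does not run scripts by itself.
  2. Study the page's code to work out what the JavaScript does, and retrieve the data you need from it directly.

Here is a way to apply the second option without adding any extra library to your project: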

    // Assume the text of the relevant script is already in a
    // String variable named javascript.
    String[] lines = javascript.split("\n");

    String imgUrl = "";
    for (String line : lines) {
        // Replace the placeholder with the name of the script
        // variable that holds the image URL.
        if (line.contains("imgUrl variable name here")) {
            imgUrl = line.trim();
            break;
        }
    }

    // imgUrl now holds the whole matching line; split() or
    // substring() it until only the URL itself remains.
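
The final "narrow it down" step could also be done with a regular expression instead of repeated split/substring calls. The following is only a minimal sketch under stated assumptions: the class name `ScriptImageUrlExtractor`, the method `extractStringVar`, the variable name `imgUrl`, and the sample script text are all made up for illustration, since the real page's script was not shown. It assumes the URL is assigned as a quoted string on a single line. With jsoup, the script text itself could probably be obtained from the parsed document via something like `doc.select("script")` and `Element.data()`.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: class, method, and variable names are illustrative only.
class ScriptImageUrlExtractor {

    /**
     * Looks for an assignment such as  varName = "value"  or  varName: 'value'
     * in the script text and returns the quoted value if one is found.
     */
    static Optional<String> extractStringVar(String javascript, String varName) {
        // Group 1 captures the opening quote, group 2 the value; \1 requires
        // the same quote character to close the string.
        Pattern p = Pattern.compile(
                "\\b" + Pattern.quote(varName) + "\\s*[=:]\\s*([\"'])(.*?)\\1");
        Matcher m = p.matcher(javascript);
        return m.find() ? Optional.of(m.group(2)) : Optional.empty();
    }

    public static void main(String[] args) {
        // Made-up script text standing in for what the real page contains.
        String javascript =
                "var productId = \"12345\";\n"
              + "var imgUrl = \"http://example.com/images/watch.jpg\";\n";

        System.out.println(extractStringVar(javascript, "imgUrl").orElse("not found"));
    }
}
```

If the page builds the URL by concatenating several pieces, or escapes quotes inside the string, a pattern this simple will not be enough and the script has to be studied more closely.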