How to extract data from an <a href=""> tag?

Time: 2015-06-22 10:42:48

Tags: java html matcher

I'm trying to extract specific data from an HTML <a> tag. I want to extract the imgurl and the surl. Here is the HTML code:

<a href="/images/search?q=nba&amp;view=detailv2&amp;&amp;&amp;
id=FE19E7BB2916CE8B6CD78148F3BC0656D151049A&amp;
selectedIndex=3&amp;
ccid=2%2f7OBkGc&amp;
simid=608035681734625885&amp;
thid=JN.tdPCsRj4HyJzbwA%2bgXsS8g" 
ihk="JN.tdPCsRj4HyJzbwA+gXsS8g" 
m="{ns:&quot;images&quot;,k:&quot;5070&quot;,dirovr:&quot;ltr&quot;,
mid:&quot;FE19E7BB2916CE8B6CD78148F3BC0656D151049A&quot;,
surl:&quot;http://www.nba.com/gallery/rookie/070727_1.html&quot;,
imgurl:&quot;http://www.nba.com/media/draft_class_3_07_070727.jpg
&quot;,
ow:&quot;300&quot;,docid:&quot;608035681734625885&quot;,oh:&quot;192&quot;,tft:&quot;58&quot;}" 
mid="FE19E7BB2916CE8B6CD78148F3BC0656D151049A" 
t1="The 2007 NBA Draft Class" 
t2="625 x 400 · 374 kB · jpeg" 
t3="www.nba.com/gallery/rookie/070727_1.html" 
h="ID=images,5070.1"><img data-bm="16" 
src="https://tse3.mm.bing.net/th?id=JN.tdPCsRj4HyJzbwA%2bgXsS8g&amp;w=217&amp;h=142&amp;c=7&amp;rs=1&amp;qlt=90&amp;o=4&amp;pid=1.1" 
style="width:217px;height:142px;" width="217" height="142">
</a>

And here is how I tried to extract it:

String title = "dog";
String url = "https://www.bing.com/images/search?q=" + title + "&FORM=HDRSC2";

try {
    // Fetch and parse the search results page
    Document doc = Jsoup.connect(url).get();
    Elements img = doc.getElementsByTag("a");

    for (Element el : img) {
        // Attempt to read "imgurl" as if it were an attribute of the <a> element
        String src = el.absUrl("imgurl");
        System.out.println(src);
    }
} catch (IOException e) {
    e.printStackTrace();
}

Please help me! I hope my problem is clear. Waiting for answers.
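One possible direction, offered as a sketch rather than a confirmed fix: in the HTML above, imgurl and surl are not attributes of the <a> element but keys inside the value of its m attribute, so asking for an attribute named "imgurl" cannot find them. Assuming the m value has already been read as a string with its &quot; entities decoded to plain quotes, the two URLs could be pulled out with java.util.regex. The class name MAttributeParser and the helper extract are hypothetical names used only for illustration, and the sample string is shortened from the markup in the question.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MAttributeParser {

    // Hypothetical helper: returns the quoted value stored under `key`
    // in a string shaped like {k1:"v1",k2:"v2"}, or empty if the key is absent.
    static Optional<String> extract(String m, String key) {
        // The key must follow '{', ',' or the start of the string,
        // so that "surl" is matched as a whole key name.
        Pattern p = Pattern.compile(
                "(?:[{,]|^)\\s*" + Pattern.quote(key) + ":\"([^\"]*)\"");
        Matcher matcher = p.matcher(m);
        // trim() drops the stray line break that precedes the closing quote
        // of imgurl in the sample markup.
        return matcher.find()
                ? Optional.of(matcher.group(1).trim())
                : Optional.empty();
    }

    public static void main(String[] args) {
        // Assumed input: the decoded value of the "m" attribute (shortened).
        String m = "{ns:\"images\",k:\"5070\","
                + "surl:\"http://www.nba.com/gallery/rookie/070727_1.html\","
                + "imgurl:\"http://www.nba.com/media/draft_class_3_07_070727.jpg\n\","
                + "ow:\"300\"}";

        System.out.println(extract(m, "surl").orElse("(not found)"));
        System.out.println(extract(m, "imgurl").orElse("(not found)"));
    }
}
```

With jsoup, the input string would come from el.attr("m") on each element returned by doc.select("a[m]"), in place of el.absUrl("imgurl"). This assumes the page that jsoup downloads still carries the m attribute in this form, which the question does not confirm.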

0 answers:

No answers