How to scrape image URLs using iMacros

Time: 2016-09-18 02:37:47

Tags: javascript html imacros

I am using iMacros to scrape website content, but I am stuck trying to extract the image URLs from the following markup.

<div class="dpimages-icons-box">
   <a href="http://host1.com/1.jpg" class="lightbox" title="9558" rel="dpimages"><img src="//host2.com/9558.jpg" alt="9558" title="9558"  width="80" height="54" /></a>
   <a href="http://host1.com/2.jpg" class="lightbox" title="9559" rel="dpimages"><img src="//host2.com/9559.jpg" alt="9559" title="9559"  width="80" height="67" /></a>
   <a href="http://host1.com/3.jpg" class="lightbox" title="9560" rel="dpimages"><img src="//host2.com/9560.jpg" alt="9560" title="9560"  width="78" height="80" /></a>
   <a href="http://host1.com/4.jpg" class="lightbox" title="9561" rel="dpimages"><img src="//host2.com/9561.jpg" alt="9561" title="9561"  width="53" height="80" /></a>
   <a href="http://host1.com/5.jpg" class="lightbox" title="9562" rel="dpimages"><img src="//host2.com/9562.jpg" alt="9562" title="9562"  width="52" height="80" /></a>
   <a href="http://host1.com/6.jpg" class="lightbox" title="9562" rel="dpimages"><img src="//host2.com/9562.jpg" alt="9562" title="9562"  width="52" height="80" /></a>
   <a href="http://host1.com/7.jpg" class="lightbox" title="9562" rel="dpimages"><img src="//host2.com/9562.jpg" alt="9562" title="9562"  width="52" height="80" /></a>
   <div class="clearing"></div>
  </div>

How can I use iMacros to extract the URLs of the first n images, for example:

http://host1.com/1.jpg
http://host1.com/2.jpg
http://host1.com/3.jpg
http://host1.com/4.jpg
http://host1.com/5.jpg

and save them to a .csv file?

1 answer:

Answer 0: (score: 0)

Try the following macro:

SET !EXTRACT_TEST_POPUP NO
TAG POS=1 TYPE=A ATTR=CLASS:lightbox&&REL:dpimages EXTRACT=HREF
TAG POS=2 TYPE=A ATTR=CLASS:lightbox&&REL:dpimages EXTRACT=HREF
TAG POS=3 TYPE=A ATTR=CLASS:lightbox&&REL:dpimages EXTRACT=HREF
TAG POS=4 TYPE=A ATTR=CLASS:lightbox&&REL:dpimages EXTRACT=HREF
TAG POS=5 TYPE=A ATTR=CLASS:lightbox&&REL:dpimages EXTRACT=HREF
SAVEAS TYPE=EXTRACT FOLDER=D:\Scrape\ FILE=pic.csv
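
Since the question asks for the first n images rather than exactly five, the macro above could also be generated for any n. The following is only a minimal sketch, assuming the script is run through the iMacros JavaScript scripting interface (a .js file played in iMacros). The function name buildExtractMacro and its parameters are hypothetical and not part of iMacros; the folder and file name are taken from the macro above.

```javascript
// Hypothetical helper (not part of iMacros): builds the macro text that
// extracts the HREF of the first n links matching the markup in the question.
function buildExtractMacro(n, folder, file) {
  var lines = ["SET !EXTRACT_TEST_POPUP NO"];
  for (var pos = 1; pos <= n; pos++) {
    // One TAG line per link; POS selects the pos-th matching <a> element.
    lines.push(
      "TAG POS=" + pos + " TYPE=A ATTR=CLASS:lightbox&&REL:dpimages EXTRACT=HREF"
    );
  }
  // Write everything extracted so far to the CSV file.
  lines.push("SAVEAS TYPE=EXTRACT FOLDER=" + folder + " FILE=" + file);
  return lines.join("\n");
}

// Same folder and file as in the macro above; n = 5 reproduces it.
var macro = buildExtractMacro(5, "D:\\Scrape\\", "pic.csv");

// iimPlay only exists when the script runs inside iMacros.
// Outside iMacros, the generated macro is simply printed for inspection.
if (typeof iimPlay === "function") {
  iimPlay("CODE:" + macro);
} else {
  console.log(macro);
}
```

Changing the first argument changes how many links are extracted. If n is larger than the number of matching links on the page, the extra TAG lines will not find an element, so n should not exceed the number of images present.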