How to get img src with Nokogiri

Date: 2017-04-03 15:40:33

Tags: ruby-on-rails ruby nokogiri

I've been trying to get the "src" attribute of the images in this loop, but I can't figure out why it returns nothing:

require 'nokogiri'
require 'open-uri'

url = "https://marketplace.asos.com/boutiques/independent-label"

doc = Nokogiri::HTML(open(url))

label = doc.css('#boutiqueList')
label.css('#boutiqueList img').attr('src').each do |l|
    p l
end

Here is the HTML:

    <ul class="itemList boutiques" id="boutiqueList">

        <li class="">


<div class="item landscapemedium" rel="sisterhood">
    <div class="image">
        <a href="/boutique/sisterhood" class="view-collection">     
        <img alt="" src="https://marketplace-images.asos.com/2016/12/23/0d664728-f484-447d-b927-679f55f24c1a_medium.jpg" class="">

1 answer:

Answer 0 (score: 1)

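Called on a NodeSet with only a name, attr reads the attribute from the first node in the set, not from every node. Read the src attribute from each element instead: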

label.css('#boutiqueList img').each { |l| p l.attr('src') }
"https://marketplace-images.asos.com/2016/12/23/0d664728-f484-447d-b927-679f55f24c1a_medium.jpg"
"https://marketplace-images.asos.com/2017/02/03/f6322297-4400-4f18-b76e-66eedfc3f620_medium.jpg"
"https://marketplace-images.asos.com/2016/10/12/2d556841-7c0c-436a-a6fd-37b333c04cfe_medium.jpg"
...
=> 0

What you actually want is an array of the src attributes of every element matching '#boutiqueList img'; for that, use map instead of each:

label.css('#boutiqueList img').map { |l| l.attr('src') }
=> ["https://marketplace-images.asos.com/2016/12/23/0d664728-f484-447d-b927-679f55f24c1a_medium.jpg", ...]