I have been trying to get the "src" attribute of the images in this loop, but I can't figure out why it doesn't return anything:
require 'nokogiri'
require 'open-uri'
url = "https://marketplace.asos.com/boutiques/independent-label"
doc = Nokogiri::HTML(open(url))
label = doc.css('#boutiqueList')
label.css('#boutiqueList img').attr('src').each do |l|
  p l
end
Here is the HTML:
<ul class="itemList boutiques" id="boutiqueList">
<li class="">
<div class="item landscapemedium" rel="sisterhood">
<div class="image">
<a href="/boutique/sisterhood" class="view-collection">
<img alt="" src="https://marketplace-images.asos.com/2016/12/23/0d664728-f484-447d-b927-679f55f24c1a_medium.jpg" class="">
Answer 0 (score: 1)
Read the src attribute from each element instead:
label.css('#boutiqueList img').each { |l| p l.attr('src') }
"https://marketplace-images.asos.com/2016/12/23/0d664728-f484-447d-b927-679f55f24c1a_medium.jpg"
"https://marketplace-images.asos.com/2017/02/03/f6322297-4400-4f18-b76e-66eedfc3f620_medium.jpg"
"https://marketplace-images.asos.com/2016/10/12/2d556841-7c0c-436a-a6fd-37b333c04cfe_medium.jpg"
...
=> 0
What you are actually after is an array holding the src attribute of every element that matches '#boutiqueList img', so you can use map instead of each:
label.css('#boutiqueList img').map { |l| l.attr('src') }
=> ["https://marketplace-images.asos.com/2016/12/23/0d664728-f484-447d-b927-679f55f24c1a_medium.jpg", ...]