Here is my code:
#test
require 'watir'
url_file = "file:///home/alain/yo.html"
# same as yo:
yo =
'<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<div class="Time">time1</div>
<span class="Locus">locus1</span>
<span class="Locus">locus2</span>
<body text="andale">
<div class="alpha">
<div class="Time">time2</div>
<span class="Locus">locus3</span>
<div class="Time">time3</div>
</div>
</body>
<span class="Locus_xxxx">locus4</span>
<span class="Locus">locus5</span>
<span class="Locus">locus6</span>
</html>'
browser = Watir::Browser.new
browser.goto url_file
result = browser.spans(class: 'Locus_xxxx').map do |sp|
  time = sp.preceding_sibling(tag_name: 'body').text
  locus = sp.text
  "#{time} #{locus}"
end
p result
Here is the output: ...timed out after 30 seconds, waiting for #<... {:class=>"Locus_xxxx", :tag_name=>"span", :index=>0} --> {:tag_name=>"body", :adjacent=>:preceding, :index=>0}> to be located (Watir::Exception::UnknownObjectException)
Note that the previous_sibling and map methods are Justin Ko's idea! Come on Watir, be kind :) The idea here is to get the text "andale" from the body tag, starting from the span tag whose class is Locus_xxxx.
Answer 0 (score: 1)
#test
require 'watir'
url_file = "file:///yo.html"
# same as yo:
yo =
'
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<div class="Time">time1</div>
<span class="Locus_1">locus1</span>
<span class="Locus">locus2</span>
<span text="andale">
<div class="alpha">
<div class="Time">time2</div>
<span class="Locus">locus3</span>
<div class="Time_yyyy">time3</div>
</div>
</span>
<span class="Locus_xxxx">locus4</span>
<span class="Locus">locus5</span>
<span class="Locus">locus6</span>
</html>
'
browser = Watir::Browser.new
browser.goto url_file
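result = browser.divs(class: 'Time_yyyy').map do |dv|
  # Climb two levels (div.alpha, then the enclosing span), then look back
  # among that span's preceding siblings for the Locus_1 span.
  locus = dv.parent.parent.preceding_sibling(tag_name: 'span', class: 'Locus_1').text
  time = dv.text
  "#{time} #{locus}"
end
p result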
This works! The result is ["time3 locus1"] [Finished in 4.0s]
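For the original goal (reading "andale" from the body tag, starting at the Locus_xxxx span), it may help to treat body as an ancestor rather than a sibling. This is only a sketch under an assumption I have not verified against the question's page: that the browser, when parsing that markup, ends up placing the Locus_xxxx span inside body, which would explain why no preceding sibling with tag_name 'body' is ever found. `body_attribute` is a hypothetical helper name; `parent` and `attribute_value` are the Watir element methods it relies on.

```ruby
# Hypothetical helper: climb from an element to its enclosing <body>
# and read one attribute there. It only needs an object that responds to
# parent(tag_name: ...) and attribute_value(name), as Watir elements do.
def body_attribute(element, name)
  element.parent(tag_name: 'body').attribute_value(name)
end

# Possible usage with Watir (assumes the page from the question is loaded):
#   span = browser.span(class: 'Locus_xxxx')
#   p body_attribute(span, 'text')
```

Note that calling `.text` on body would return the visible text of the whole page rather than the value of its `text` attribute, which is why the sketch uses `attribute_value`.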
Related topic: Watir scraping sequential elements : so simple, but no