我有这段代码
require 'mechanize'
@agent = Mechanize.new
page = @agent.get('http://something.com/?page=1')
next_page = page.link_with(:href=>/^?page=2/).click
正如您所见,此代码应转到下一页。
next_page
应该有网址http://something.com/?page=2
如何获取next_page
的当前网址?
答案 0 :(得分:23)
next_page.uri.to_s
请参阅http://www.rubydoc.info/gems/mechanize/Mechanize/Page/Link#uri-instance_method和http://ruby-doc.org/stdlib-2.4.1/libdoc/uri/rdoc/URI.html
出于测试目的,我在irb中执行了以下操作:
require 'mechanize'
@agent = Mechanize.new
page = @agent.get('http://news.ycombinator.com/news')
=> #<Mechanize::Page
{url #<URI::HTTP:0x00000001ad3198 URL:http://news.ycombinator.com/news>}
{meta_refresh}
{title "Hacker News"}
{iframes}
{frames}
{links
#<Mechanize::Page::Link "" "http://ycombinator.com">
#<Mechanize::Page::Link "Hacker News" "news">
#<Mechanize::Page::Link "new" "newest">
#<Mechanize::Page::Link "comments" "newcomments">
#<Mechanize::Page::Link "ask" "ask">
#<Mechanize::Page::Link "jobs" "jobs">
#<Mechanize::Page::Link "submit" "submit">
#<Mechanize::Page::Link "login" "newslogin?whence=%6e%65%77%73">
#<Mechanize::Page::Link "" "vote?for=3803568&dir=up&whence=%6e%65%77%73">
#<Mechanize::Page::Link
"Don’t Be Evil: How Google Screwed a Startup"
"http://blog.hatchlings.com/post/20171171127/dont-be-evil-how-google-screwed-a-startup">
#<Mechanize::Page::Link "mikeknoop" "user?id=mikeknoop">
#<Mechanize::Page::Link "64 comments" "item?id=3803568">
#<Mechanize::Page::Link "" "vote?for=3802515&dir=up&whence=%6e%65%77%73">
# Omitted for brevity...
next_page.uri
=> #<URI::HTTP:0x00000001fa7818 URL:http://news.ycombinator.com/news2>
next_page.uri.to_s
=> "http://news.ycombinator.com/news2"