How do I click a link with Mechanize to get more details about an order?

Time: 2013-12-09 21:10:33

Tags: ruby nokogiri mechanize

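On the orders index page I am iterating over all single orders (http://pastebin.com/FtiTBXG4). What I want to do is click each single order's link so that I can parse more information about that order.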

But either the clicking or the iteration does not seem to work. It just keeps clicking the first order.

This is the code I'm using:

require 'mechanize'

a = Mechanize.new

a.get('http://exampleshop.nl/admin/') do |page|

    # Select the login form
    login_form = page.forms.first

    # Insert the username and password
    login_form.username = 'username'
    login_form.password = 'password'

    # Submit the login information
    dashboard_page = a.submit(login_form, login_form.buttons.first)

    # Check if the login was successful
    puts check_1 = dashboard_page.title == 'Dashboard' ?  "CHECK 1 DASHBOARD SUCCESS" : "CHECK 1 DASHBOARD FAIL"

    # Visit the orders index page to scrape some standard information
    orders_page = a.click(dashboard_page.link_with(:text => /Bestellingen/))

    # pp orders_page # => http://pastebin.com/L3zASer6

    # Check if the visit is successful
    puts check_2 = orders_page.title == 'Bestellingen' ?  "CHECK 2 ORDERS SUCCESS" : "CHECK 2 ORDERS FAIL"

    # Search for all #singleOrder table rows and put them in variable all_single_orders
    all_single_orders = orders_page.search("#singleOrder") 

    # puts all_single_orders.class  # => Nokogiri::XML::NodeSet
    # puts all_single_orders                # => http://pastebin.com/FtiTBXG4
    # pp all_single_orders                  # => http://pastebin.com/UMRxGDn2

    # Scrape the needed information (the actual save to database is omitted)
    all_single_orders.each do |order|
        # Fetch the standard information
        puts orderId = order.search("#orderId").text                # => 259    
        puts customerName = order.search("#customerName").text      # => Firstname Lastname     
        puts orderStatus = order.search("#orderStatus").text        # => Bestelling ontvangen           
        puts orderAmount = order.search("#orderAmount").text            # => € 41,94

        # pp order # => sample of a single `order` iteration: http://pastebin.com/FkM8DVT8

        # Visit a single order page to fetch more detailed information
        single_order_page = orders_page.link_with(:text => /Bekijk/).click

        # puts single_order_page.class # => Mechanize::Page

        # Print the URI to check what page we're on
        puts single_order_page.uri # => http://www.fonexshop.nl/admin/index.php?route=sale/order/info&token=SOMETOKEN&order_id=259
    end
end

This is the output:

CHECK 1 DASHBOARD SUCCESS
CHECK 2 ORDERS SUCCESS
http://www.exampleshop.nl/admin/index.php?route=sale/order/info&token=e29984974b56db4ba9d3c91a47d26f90&order_id=259
http://www.exampleshop.nl/admin/index.php?route=sale/order/info&token=e29984974b56db4ba9d3c91a47d26f90&order_id=259
http://www.exampleshop.nl/admin/index.php?route=sale/order/info&token=e29984974b56db4ba9d3c91a47d26f90&order_id=259
...

Any ideas on how to solve this? I'm using Ruby 2.0.0 and Mechanize 2.7.3.

1 Answer:

Answer 0: (score: 2)

The problem is this line:

single_order_page = orders_page.link_with(:text => /Bekijk/).click

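Here you are telling Mechanize to click the first link on the page with the text "Bekijk". Note that `orders_page.link_with` looks for the first matching link on the whole page, not just within the current order (i.e. the table row), so every iteration clicks the same link.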

I think you need to get the href of the link within the order, and then click the link with that href (or navigate directly to it):

all_single_orders.each do |order|
    # Fetch the href of this order's link (assuming it is the first link in the row)
    puts orderLink = order.at_css("a")['href']

    # Visit a single order page to fetch more detailed information
    single_order_page = orders_page.link_with(:href => orderLink).click

    # Print the URI to check what page we're on
    puts single_order_page.uri # => http://www.fonexshop.nl/admin/index.php?route=sale/order/info&token=SOMETOKEN&order_id=259
end
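
For the "navigate directly" option, one possible sketch is to turn each row's href into an absolute URL and pass it to the agent's `get`. The page URL and hrefs below are hypothetical placeholders (in the real loop they would come from `orders_page.uri` and `order.at_css("a")['href']`), and the snippet only builds the URLs so that it runs without a network connection:

```ruby
require 'uri'

# Assumed values for illustration only.
orders_page_url = 'http://exampleshop.nl/admin/index.php?route=sale/order'
order_hrefs = [
  'index.php?route=sale/order/info&order_id=259',
  'index.php?route=sale/order/info&order_id=258'
]

# Resolve each href against the orders page URL.
# An href that is already absolute is returned unchanged.
order_urls = order_hrefs.map { |href| URI.join(orders_page_url, href).to_s }

order_urls.each do |url|
  puts url
  # With a logged-in agent `a`, the detail page could then be fetched with:
  #   single_order_page = a.get(url)
end
```

Resolving against the orders page URL, rather than relying on whatever page the agent visited last, should keep the loop working even though each fetch changes the agent's current page.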