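While scraping a single order page (in the admin section of my OpenCart shop; the full HTML is at http://pastebin.com/SaLc5jHu) for the customer's email address, I get the following as the email address value: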
[email protected]
/* <![CDATA[ */
(function(){try{var s,a,i,j,r,c,l,b=document.getElementsByTagName("script");l=b[b.length-1].previousSibling;a=l.getAttribute('data-cfemail');if(a){s='';r=parseInt(a.substr(0,2),16);for(j=2;a.length-j;j+=2){c=parseInt(a.substr(j,2),16)^r;s+=String.fromCharCode(c);}s=document.createTextNode(s);l.parentNode.replaceChild(s,l);}}catch(e){}})();
/* ]]> */
Here is the code:
require 'mechanize'

a = Mechanize.new
a.get('http://exampleshop.nl/admin/') do |page|
  # Select the login form
  login_form = page.forms.first
  # Insert the username and password
  login_form.username = 'username'
  login_form.password = 'password'
  # Submit the login information
  dashboard_page = a.submit(login_form, login_form.buttons.first)
  # Check if the login was successful
  puts check_1 = dashboard_page.title == 'Dashboard' ? "CHECK 1 DASHBOARD SUCCESS" : "CHECK 1 DASHBOARD FAIL"
  # Visit the orders index page to scrape some standard information
  orders_page = a.click(dashboard_page.link_with(:text => /Bestellingen/))
  # pp orders_page # => http://pastebin.com/L3zASer6
  # Check if the visit was successful
  puts check_2 = orders_page.title == 'Bestellingen' ? "CHECK 2 ORDERS SUCCESS" : "CHECK 2 ORDERS FAIL"
  # Search for all #singleOrder table rows and store them in all_single_orders
  all_single_orders = orders_page.search("#singleOrder")
  # Scrape the needed information (the actual save to the database is omitted)
  all_single_orders.each do |order|
    # Set the link for each order
    order_link = order.at_css("a")['href'] # Assuming the first link in the row
    order_id = order.search("#orderId").text # => 259
    order_status = order.search("#orderStatus").text # => Bestelling ontvangen
    order_amount = order.search("#orderAmount").text # => € 41,94
    # Visit a single order page to fetch more detailed information
    single_order_page = orders_page.link_with(:href => order_link).click
    # Fetch more information
    puts first_name = single_order_page.search(".firstName").text
    puts last_name = single_order_page.search(".lastName").text
    puts email = single_order_page.search(".email").text # => [email protected] /* <![CDATA[ */...
    puts postal_code = single_order_page.search(".postalCode").text
    puts address = single_order_page.search(".address").text
    puts product_quantity = single_order_page.search(".orderQuantity").text
  end
end
Any ideas? I'm using Ruby 2.0.0 and Mechanize 2.7.3, and the site is set up behind CloudFlare.
It works now. To get this working, just disable the ScrapeShield email obfuscation option in CloudFlare's "Apps" panel (https://www.cloudflare.com/cloudflare-apps).
Answer 0 (score: 0)
It did not work because a CloudFlare app called ScrapeShield was active.
To get this working, just deactivate the ScrapeShield E-mail obfuscation option in the "Apps" panel (https://www.cloudflare.com/cloudflare-apps).
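If switching the option off is not possible, the inline script quoted in the question hints at another route: it reads a hex string from a data-cfemail attribute, treats the first byte as an XOR key, and XORs every following byte with it to rebuild the address. Below is a minimal Ruby sketch of that same logic. The method name decode_cfemail is my own, not part of Mechanize or CloudFlare, and this only mirrors the script shown above, so it may not match other versions of the obfuscation.

```ruby
# Hypothetical helper (not part of Mechanize or CloudFlare):
# decodes a data-cfemail hex string the same way the inline script
# in the question does.
def decode_cfemail(hex)
  # Split the string into two-character hex pairs and convert to integers
  bytes = hex.scan(/../).map { |pair| pair.to_i(16) }
  # The first byte is the XOR key; the remaining bytes are the encoded address
  key = bytes.shift
  bytes.map { |b| b ^ key }.pack("C*").force_encoding("UTF-8")
end
```

In the scraper this could be applied along the lines of node = single_order_page.at(".email [data-cfemail]") followed by decode_cfemail(node['data-cfemail']). That selector is an assumption on my part, since the actual markup is only in the pastebin, so adjust it to wherever the attribute appears in your page.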