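I have installed the mechanize gem in a Rails app, and to test it I am simply copying and pasting the code below into the irb console. It logs in, and I can put "Orange" in the search field and submit the form, but the resulting page contains nothing matching "Orange", nor any of the Orange employees that I see when I run the same search in a browser. Does LinkedIn have some security feature that blocks this, or am I doing something wrong?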
require 'rubygems'
require 'mechanize'
require 'nokogiri'
require 'open-uri'
#create agent
agent = Mechanize.new do |a|
  a.user_agent_alias = 'Mac Safari 4'
end
agent.follow_meta_refresh = true
#visit page
page = agent.get("https://www.linkedin.com/")
#login
login_form = page.form('login')
login_form.session_key = "email"
login_form.session_password = "password"
page = agent.submit(login_form, login_form.buttons.first)
# get the form
form = agent.page.form_with(:name => "commonSearch")
#fill form out
form.keywords = 'Orange France'
# get the button you want from the form
button = form.button_with(:value => "Search")
# submit the form using that button
agent.submit(form, button)
agent.page.link_with(:text => "Orange") # => nil
Answer 0 (score: 1)
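The problem is that Mechanize does not execute JavaScript, so it cannot see content that a page loads with JavaScript, which is how the LinkedIn search results are rendered in this scenario.
One workaround is to look at the raw body of the response, pull out the data you need with a regular expression, and then parse the match as JSON.
For example: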
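require 'json' # needed for JSON.parse
# `agent` is the logged-in Mechanize agent from the question
url = "http://www.linkedin.com/vsearch/p?type=people&keywords=dario+barrionuevo"
# Collect every {"person":{...}} fragment embedded in the raw response body
results = agent.get(url).body.scan(/\{"person":\{.*?\}\}/)
person = results.first # You'd use an each here, but for the example we'll get the first
json = JSON.parse(person)
json['person']['firstName'] # => 'Dario'
json['person']['lastName'] # => 'Barrionuevo'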
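To iterate over every match rather than only the first, something like the following might work. It is a minimal sketch, not a definitive implementation: it runs against a hard-coded sample string instead of a live request, and `SAMPLE_BODY`, `PERSON_PATTERN`, and `extract_people` are hypothetical names introduced for illustration. The shape of the embedded data is assumed from the example above and may not match what LinkedIn actually returns.

```ruby
require 'json'

# Hypothetical stand-in for agent.get(url).body; the real markup is an assumption.
SAMPLE_BODY = <<~HTML
  <html><body><script>
  var data = {"results":[{"person":{"firstName":"Dario","lastName":"Barrionuevo"}},{"person":{"firstName":"Ana","lastName":"Orange"}}]};
  </script></body></html>
HTML

# Non-greedy match from {"person":{ up to the first }}.
# Caveat: a person object containing nested objects would be cut short.
PERSON_PATTERN = /\{"person":\{.*?\}\}/

# Returns an array of person hashes, skipping fragments that are not valid JSON.
def extract_people(body)
  parsed = body.scan(PERSON_PATTERN).map do |fragment|
    begin
      JSON.parse(fragment)['person']
    rescue JSON::ParserError
      nil # truncated or malformed fragment; ignore it
    end
  end
  parsed.compact
end

people = extract_people(SAMPLE_BODY)
people.each { |person| puts "#{person['firstName']} #{person['lastName']}" }
# prints:
# Dario Barrionuevo
# Ana Orange
```

With a real agent you would pass `agent.get(url).body` instead of `SAMPLE_BODY`. Because the regex stops at the first `}}`, this approach is fragile; if the fragments turn out to contain nested objects, parsing the whole embedded JSON document would likely be more reliable.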