Question

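I have installed the mechanize gem in my Rails app, and to test it I simply copied and pasted the code below into the irb console. It logs in to the page, and I can put "Orange" into the search field and submit the form, but the resulting page contains nothing matching "Orange", and none of the Orange employees that I see in the browser. Does LinkedIn have some security feature that prevents this, or am I doing something wrong?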

require 'rubygems'
require 'mechanize'
require 'nokogiri'
require 'open-uri'

#create agent
agent = Mechanize.new { |agent| 
    agent.user_agent_alias = 'Mac Safari 4'
}
agent.follow_meta_refresh = true
#visit page
page = agent.get("https://www.linkedin.com/")

#login
login_form = page.form('login')
login_form.session_key = "email"
login_form.session_password = "password"
page = agent.submit(login_form, login_form.buttons.first)

# get the form
form = agent.page.form_with(:name => "commonSearch")
#fill form out
form.keywords = 'Orange France'
# get the button you want from the form
button = form.button_with(:value => "Search")
# submit the form using that button
agent.submit(form, button)

agent.page.link_with(:text => "Orange")
=> nil

Answer 1

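The problem is that Mechanize does not execute JavaScript, so it cannot directly work with content that is loaded by JavaScript, which is the case with LinkedIn search here.

The solution is to look at the body of the page, use a regular expression to extract the content you need, and then parse the result as JSON.

For example: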

require 'json'

# agent is the logged-in Mechanize agent from the question
url = "http://www.linkedin.com/vsearch/p?type=people&keywords=dario+barrionuevo"

results = agent.get(url).body.scan(/\{"person"\:\{.*?\}\}/)

person = results.first # You'd use an each here, but for the example we'll get the first

json = JSON.parse(person)
json['person']['firstName'] # => 'Dario'
json['person']['lastName'] # => 'Barrionuevo'
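
The example above parses only the first match. Here is a minimal sketch of looping over every match, assuming the page body still embeds each result as a `{"person":{...}}` object. The helper `extract_people` and the `sample_body` string are hypothetical, made up for illustration; in real use you would pass `agent.get(url).body` instead.

```ruby
require 'json'

# Same pattern as in the answer: the shortest span from {"person":{ to the first }}.
PERSON_PATTERN = /\{"person":\{.*?\}\}/

# Returns an array of hashes, one per matched fragment that parses as valid JSON.
def extract_people(body)
  parsed = body.scan(PERSON_PATTERN).map do |fragment|
    begin
      JSON.parse(fragment)['person']
    rescue JSON::ParserError
      nil # the non-greedy match cuts nested objects short; skip those fragments
    end
  end
  parsed.compact
end

# Hypothetical sample data standing in for agent.get(url).body
sample_body = 'var data = [{"person":{"firstName":"Dario","lastName":"Barrionuevo"}},' \
              '{"person":{"firstName":"Ana","lastName":"Lopez"}}];'

people = extract_people(sample_body)
people.each { |person| puts "#{person['firstName']} #{person['lastName']}" }
```

Note that the non-greedy regex stops at the first `}}`, so a person object containing nested objects would be truncated; the sketch simply skips fragments that fail to parse rather than trying to recover them.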
