Why does Mechanize return different results than a manual search?

Time: 2016-02-04 21:17:29

Tags: ruby mechanize-ruby

I am using the following code:

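require 'rubygems'
require 'mechanize'
require 'nokogiri'
require 'open-uri'
require 'logger'
require 'json' # JSON.parse is used when printing the results
require 'slowweb'

# Rate-limit requests to linkedin.com
SlowWeb.limit('linkedin.com', 1, 10)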

# Create the agent (the block variable is named `a` so it does not shadow `agent`)
agent = Mechanize.new { |a|
  a.user_agent_alias = 'Mac Firefox'
  a.log = Logger.new "mech.log"
}
agent.follow_meta_refresh = true
page = agent.get("https://ca.linkedin.com/")

# Log in
login_form = page.forms.first
login_form.session_key = "username"
login_form.session_password = "pass"

page = agent.submit(login_form, login_form.buttons.first)
# Request the search URL once and scan the body for embedded person records
search_url = "https://www.linkedin.com/vsearch/f?type=all&keywords=Recruiter+Boston"
results = agent.get(search_url).body.scan(/\{"person":\{.*?\}\}/)
results.each do |person|
  json = JSON.parse(person)
  puts json['person']['firstName']
  puts json['person']['lastName']
end

This lists the people I am currently connected to, so I know I am logged in, but when I run the same search manually it lists Boston recruiters.
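One way to narrow this down might be to separate parsing from fetching, so the same extraction can be run on the body Mechanize received and on page source saved from a browser. Below is a minimal sketch, assuming the page embeds records as `{"person":{...}}` fragments, as the regex above implies. `extract_people` is a hypothetical helper name and the sample body is made up for illustration.

```ruby
require 'json'

# Hypothetical helper: pull every {"person":{...}} fragment out of a page body
# and return the parsed "person" hashes. Fragments that are not valid JSON
# (for example, when the non-greedy regex stops at the first "}}" inside a
# nested object) are skipped instead of raising.
def extract_people(body)
  body.scan(/\{"person":\{.*?\}\}/).map { |fragment|
    begin
      JSON.parse(fragment)['person']
    rescue JSON::ParserError
      nil
    end
  }.compact
end

# Illustrative body, not real LinkedIn output.
sample = 'x{"person":{"firstName":"Ann","lastName":"Lee"}}y' \
         '{"person":{"firstName":"Bo","lastName":"Kim"}}z'

extract_people(sample).each do |person|
  puts "#{person['firstName']} #{person['lastName']}"
end
```

Running the helper on both bodies would show whether the two pages really contain different people or whether the regex is only picking up a different part of the same page.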

I suspect my scraper is being detected and gamed, but if you have any other ideas, I would be happy to hear them.
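Before concluding that the scraper is being detected, it may be worth checking whether the request was redirected to a URL that no longer carries the search terms. Below is a minimal sketch using only the standard library. `dropped_params` and `query_params` are hypothetical helper names, and the "final" URL in the example is invented. With Mechanize, the final URL could be taken from `page.uri.to_s` on the page returned by `agent.get`.

```ruby
require 'uri'

# Hypothetical helper: decode the query string of a URL into a Hash.
def query_params(url)
  URI.decode_www_form(URI.parse(url).query.to_s).to_h
end

# Hypothetical helper: return the query parameters that were present in the
# requested URL but are missing or changed in the URL that was finally served.
def dropped_params(requested_url, final_url)
  final = query_params(final_url)
  query_params(requested_url).reject { |key, value| final[key] == value }
end

requested = 'https://www.linkedin.com/vsearch/f?type=all&keywords=Recruiter+Boston'
# Invented example of a final URL after a redirect (an assumption, not observed output).
final = 'https://www.linkedin.com/vsearch/f?type=all'

missing = dropped_params(requested, final)
puts "Parameters lost after redirect: #{missing.keys.join(', ')}" unless missing.empty?
```

If no parameters are lost and the body still differs from what a browser shows, the difference is more likely to come from the response content itself than from a redirect.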

0 answers:

No answers