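Getting 405 Not Allowed when parsing a web page

Date: 2017-09-22 20:45:26

Tags: ruby web-scraping nokogiri open-uri

Before asking this question I searched for a solution, but unfortunately nothing I found worked. When I access this particular URL, I get OpenURI::HTTPError: 405 Not Allowed: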

require 'open-uri'
require 'nokogiri'
doc = Nokogiri::HTML(open("http://streeteasy.com"))

#=> OpenURI::HTTPError: 405 Not Allowed
  from /Users/cyrusghazanfar/.rvm/rubies/ruby-2.2.0/lib/ruby/2.2.0/open-uri.rb:358:in `open_http'

I also tried:

$ curl -I http://streeteasy.com

which returns:

HTTP/1.1 405 Not Allowed
Date: Fri, 22 Sep 2017 20:03:59 GMT
Content-Type: text/html
Connection: keep-alive
Server: nginx
X-DZ: 24.193.31.96
Vary: Accept-Encoding
X-DZ: 127.0.0.1
Expires: Thu, 01 Jan 1970 00:00:01 GMT
Cache-Control: private, no-cache, no-store, must-revalidate
Edge-Control: no-store, bypass-cache
Surrogate-Control: no-store, bypass-cache

1 Answer:

Answer 0 (score: 3)

The problem is that the server requires a User-Agent header before it will respond, so with curl it would look like this:

curl --header "User-Agent: Mozilla/5.0" http://streeteasy.com
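
The same idea can probably be carried over to the Ruby code in the question, since open-uri accepts request headers as string keys in its options hash. The following is only a minimal sketch: the helper names browser_headers and fetch_html are made up for illustration, and the "Mozilla/5.0" value is simply copied from the curl command above. It uses URI.open, which assumes Ruby 2.5 or newer; on an older Ruby such as the 2.2.0 shown in the question, plain open(url, headers) should behave the same way.

```ruby
require 'open-uri'

# Hypothetical helper (not part of open-uri): builds the header hash
# that is handed to open-uri. String keys are sent as HTTP header fields.
def browser_headers(user_agent = 'Mozilla/5.0')
  { 'User-Agent' => user_agent }
end

# Hypothetical helper: fetches the response body as a String, sending
# the User-Agent header with the request.
# Assumes Ruby 2.5+ for URI.open; on older Rubies use open(url, headers).
def fetch_html(url, headers = browser_headers)
  URI.open(url, headers, &:read)
end
```

The returned string could then be passed to Nokogiri as in the question, for example doc = Nokogiri::HTML(fetch_html("http://streeteasy.com")). Whether the site accepts this particular User-Agent value from a script is not guaranteed.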