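We are building a website that presents data from the UCAS website about universities and courses. We are trying to restrict the results to Scottish universities only, but the code below does not seem to work. The 'Location' key set on the form is the name of the input ID for that form field on the UCAS site, yet the results still list every university.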
class PagesController < ApplicationController
  def home
    require 'mechanize'
    mechanize = Mechanize.new
    @uninames_array = []
    page = mechanize.get('http://search.ucas.com/')
    form = page.forms.first
    form['Vac'] = '2'
    form['AvailableIn'] = '2016'
    form['Location'] = 'scotland'
    page = form.submit
    page.search('li.result h3').each do |h3|
      # puts h3.text.strip
    end
    while next_page_link = page.at('.pager a[text()=">"]')
      page = mechanize.get(next_page_link['href'])
      page.search('li.result h3').each do |h3|
        # puts h3.text.strip
        name = h3.text
        @uninames_array.push(name)
      end
    end
  end
end
Answer 0 (score: 0)
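It looks like the CountryCode variable on the page is initialised by JavaScript. That is why the form submission does not return the expected results.
Mechanize cannot execute JavaScript, but you can send the search as a GET request instead. You have to specify all of the parameters yourself, including CountryCode.
Example: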
require 'mechanize'
mechanize = Mechanize.new
@uninames_array = []
# Same search without CountryCode, kept for comparison:
#page = mechanize.get('http://search.ucas.com/search/providers?CountryCode=&RegionCode=&Lat=&Lng=&Feather=&Vac=2&Query=&ProviderQuery=&AcpId=&Location=scotland&IsFeatherProcessed=True&SubjectCode=&AvailableIn=2016')
# CountryCode=3 is the value the page's JavaScript assigns to Scotland.
page = mechanize.get('http://search.ucas.com/search/providers?CountryCode=3&RegionCode=&Lat=&Lng=&Feather=&Vac=2&Query=&ProviderQuery=&AcpId=&Location=scotland&IsFeatherProcessed=True&SubjectCode=&AvailableIn=2016')
# Collect the names from the first page of results.
page.search('li.result h3').each do |h3|
  name = h3.text
  @uninames_array.push(name)
end
# Follow the ">" pager link until there are no more pages.
while next_page_link = page.at('.pager a[text()=">"]')
  page = mechanize.get(next_page_link['href'])
  page.search('li.result h3').each do |h3|
    name = h3.text
    @uninames_array.push(name)
  end
end
puts @uninames_array.to_s
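The long query string above is easy to mistype, so it may help to build it from a hash. The following is only a sketch: the helper name `ucas_search_url` and its keyword arguments are made up for illustration, and the parameter names and their order are copied from the URL in the example. It uses `URI.encode_www_form` from Ruby's standard library.

```ruby
require 'uri'

# Endpoint taken from the example URL above.
SEARCH_ENDPOINT = 'http://search.ucas.com/search/providers'

# Hypothetical helper (not part of Mechanize or UCAS): builds the search URL.
# Empty strings are used for unused parameters so they are sent as "Key=".
def ucas_search_url(country_code:, location:, vac: 2, available_in: 2016)
  params = {
    'CountryCode'        => country_code,
    'RegionCode'         => '',
    'Lat'                => '',
    'Lng'                => '',
    'Feather'            => '',
    'Vac'                => vac,
    'Query'              => '',
    'ProviderQuery'      => '',
    'AcpId'              => '',
    'Location'           => location,
    'IsFeatherProcessed' => 'True',
    'SubjectCode'        => '',
    'AvailableIn'        => available_in
  }
  "#{SEARCH_ENDPOINT}?#{URI.encode_www_form(params)}"
end

puts ucas_search_url(country_code: 3, location: 'scotland')
```

The resulting string can be passed to `mechanize.get` in place of the hard-coded URL.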
If you need the data for all countries, the page contains JavaScript that lists their codes:
var countries = [],
regions, geoCordinates;
countries.england = 1;
countries.wales = 2;
countries.scotland = 3;
countries["northern ireland"] = 4;
countries.ni = 4;
countries.ireland = 4;
countries.uk = "1|2|3|4|5";
countries["united kingdom"] = "1|2|3|4|5";
regions = [];
regions["central scotland"] = 301;
regions["channel isles"] = 901;
regions["channel islands"] = 901;
regions["dumfries and galloway"] = 302;
regions["east midlands"] = 101;
regions["east england"] = 102;
regions["east sussex"] = 111;
regions["east wales"] = 201;
regions.fife = 303;
regions.grampian = 304;
regions["isle man"] = 902;
regions.london = 103;
regions.lothian = 305;
regions["mid wales"] = 202;
regions["north east"] = 104;
regions["north east england"] = 104;
regions["north wales"] = 203;
regions["north west"] = 105;
regions["north west england"] = 105;
regions.orkney = 306;
regions["scottish borders"] = 307;
regions["scottish highlands"] = 308;
regions["shetland islands"] = 309;
regions["south east"] = 106;
regions["south east england"] = 106;
regions["south east wales"] = 204;
regions["south wales"] = 205;
regions["south west"] = 107;
regions["south west england"] = 107;
regions.strathclyde = 310;
regions.tayside = 311;
regions["west midlands"] = 108;
regions["west sussex"] = 112;
regions["west wales"] = 206;
regions["yorkshire and humber"] = 109;
regions["yorkshire and the humber"] = 109;
regions.yorkshire = 109;
regions.bedfordshire = 114;
regions.essex = 10201;
regions.kent = 10601;
regions.hampshire = 10602;
regions.cornwall = 10701;
regions["north yorkshire"] = 10901;
regions.midlands = "101|108";
regions.sussex = "111|112";
regions["north england"] = "104|105|109";
regions["northern england"] = "104|105|109";
regions["south england"] = "102|103|106|107|114";
regions["southern england"] = "102|103|106|107|114";
geoCordinates = [];
geoCordinates.jordanstown = "54.68627,-5.88206,0"
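If you want to pick the CountryCode on the Ruby side, the `countries` table from that script could be mirrored as a hash. This is a rough sketch: `COUNTRY_CODES` and `country_code_for` are hypothetical names, the values are copied from the JavaScript above, and whether the server accepts the pipe-separated values (such as the one for "uk") directly as CountryCode is an assumption that has not been verified.

```ruby
# Country codes copied from the page's JavaScript (kept as strings because
# some entries are pipe-separated lists).
COUNTRY_CODES = {
  'england'          => '1',
  'wales'            => '2',
  'scotland'         => '3',
  'northern ireland' => '4',
  'ni'               => '4',
  'ireland'          => '4',
  'uk'               => '1|2|3|4|5',
  'united kingdom'   => '1|2|3|4|5'
}.freeze

# Hypothetical helper: returns the code for a location name, or an empty
# string when the name is not in the table.
def country_code_for(location)
  COUNTRY_CODES.fetch(location.to_s.strip.downcase, '')
end

puts country_code_for('Scotland')
```

The same approach would work for the `regions` table if you need to filter by RegionCode.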