How do I send form data with a POST request?

Time: 2014-07-14 13:42:04

Tags: ruby post http-post httparty open-uri

I want to view and scrape the job listings on https://www.akzonobel.com/nl/careers/vacatures/. The country must be the Netherlands and the job level must be "Entry level".

I'm sending a POST request with httparty, but it keeps returning the first 10 job listings, unfiltered. The correct result should be 3 job listings.

This is the code I'm using:

require 'httparty'
require 'nokogiri'

@base_url = 'https://www.akzonobel.com'

url = "#{@base_url}/careers/vacatures/"

data = {
  'ctl00$contentLeft$ctl01$ddlCountryExt' => 'NLD',
  'ctl00$contentLeft$ctl01$ddlJobLevelExt' => 'ENTRY_LEVEL'
}

response = HTTParty.post("#{@base_url}/nl/careers/vacatures/", :body => data)

html = Nokogiri::HTML(response)

jobs = html.xpath('//h3//a')

jobs.each do |job|
  puts job.text
end

puts jobs.size

Output:

Regional Demand Planner Nordeuropa (m,w)
Forecast Analyst - TiO2 Spend Area
PS Regional Manager APAC
Production leader
Engineering Administrator - Temporary
Procurement Manager EMEA
Business Analyst, Americas
HR Business Partner Supply Chain and R&D
AS Regional Manager
Business Information Manager
10

How do I send the form data the site expects so that I get the correct response?


Update

I tried the following:

require 'httparty'
require 'nokogiri'

@base_url = 'https://www.akzonobel.com'

url = "#{@base_url}/careers/vacatures/"

data = {
  'ctl00$contentLeft$ctl01$ddlCountryExt' => 'NLD',
  'ctl00$contentLeft$ctl01$ddlJobLevelExt' => 'ENTRY_LEVEL',
  'ctl00$contentLeft$ctl01$ddlContinentExt' => 1,
  'ctl00$contentLeft$ctl01$ddlRegionEx' => 4,
  'ctl00$contentLeft$ctl01$ddlJobFamilyEx' => 45,
  'ctl00$contentLeft$ctl01$ddlBusinessUnitExt' => 22,
  'ctl00$contentLeft$ctl01$ddlJobLevelExt' => 1,
  'ctl00$contentLeft$ctl01$ddlCountryExt' => 1,
}

response = HTTParty.post("#{@base_url}/nl/careers/vacatures/", :body => data)

html = Nokogiri::HTML(response)

jobs = html.xpath('//h3//a')

jobs.each do |job|
  puts job.text
end

puts jobs.size

Unfortunately, the result is exactly the same.


Update 2:

This is the updated code:

require 'httparty'
require 'nokogiri'

@base_url = 'https://www.akzonobel.com'

url = "#{@base_url}/careers/vacatures/"

data = {
  'contentLeft_ctl01_ddlContinentExt' => 'C_EUROPE',
  'contentLeft_ctl01_ddlCountryExt' => 'NLD',
  'contentLeft_ctl01_ddlRegionExt' => 'Gelderland',
  'contentLeft_ctl01_ddlRegionExt' => 'Limburg',
  'contentLeft_ctl01_ddlRegionExt' => 'North Holland',
  'contentLeft_ctl01_ddlRegionExt' => 'South Holland',
  'contentLeft_ctl01_ddlJobFamilyExt' => 'General Management',
  'contentLeft_ctl01_ddlJobFamilyExt' => 'Integrated Supply Chain',
  'contentLeft_ctl01_ddlJobFamilyExt' => 'Sales & Marketing',
  'contentLeft_ctl01_ddlJobFamilyExt' => 'RD&I',
  'contentLeft_ctl01_ddlJobFamilyExt' => 'Support',
  'contentLeft_ctl01_ddlJobFamilyExt' => 'Other',
  'contentLeft_ctl01_ddlJobFamilyExt' => 'Lvl2_General Management',
  'contentLeft_ctl01_ddlJobFamilyExt' => 'Manufacturing',
  'contentLeft_ctl01_ddlJobFamilyExt' => 'HSE',
  'contentLeft_ctl01_ddlJobFamilyExt' => 'Engineering',
  'contentLeft_ctl01_ddlJobFamilyExt' => 'Procurement',
  'contentLeft_ctl01_ddlJobFamilyExt' => 'Distribution & Logistics',
  'contentLeft_ctl01_ddlJobFamilyExt' => 'Sales',
  'contentLeft_ctl01_ddlJobFamilyExt' => 'Marketing',
  'contentLeft_ctl01_ddlJobFamilyExt' => 'Lvl2_RD&I',
  'contentLeft_ctl01_ddlJobFamilyExt' => 'Finance',
  'contentLeft_ctl01_ddlJobFamilyExt' => 'IM',
  'contentLeft_ctl01_ddlJobFamilyExt' => 'HR',
  'contentLeft_ctl01_ddlJobFamilyExt' => 'Legal, IP & Compliance',
  'contentLeft_ctl01_ddlJobFamilyExt' => 'Facilities',
  'contentLeft_ctl01_ddlJobFamilyExt' => 'Lvl2_Other',
  'contentLeft_ctl01_ddlJobFamilyExt' => '80200000',
  'contentLeft_ctl01_ddlJobFamilyExt' => '80300000',
  'contentLeft_ctl01_ddlJobFamilyExt' => '81900000',
  'contentLeft_ctl01_ddlJobFamilyExt' => '81100000',
  'contentLeft_ctl01_ddlJobFamilyExt' => '82000000',
  'contentLeft_ctl01_ddlJobFamilyExt' => '81200000',
  'contentLeft_ctl01_ddlJobFamilyExt' => '80700000',
  'contentLeft_ctl01_ddlJobFamilyExt' => '80400000',
  'contentLeft_ctl01_ddlJobFamilyExt' => '80500000',
  'contentLeft_ctl01_ddlJobFamilyExt' => '80800000',
  'contentLeft_ctl01_ddlJobFamilyExt' => '80900000',
  'contentLeft_ctl01_ddlJobFamilyExt' => '82100000',
  'contentLeft_ctl01_ddlJobFamilyExt' => '82200000',
  'contentLeft_ctl01_ddlJobFamilyExt' => '81010000',
  'contentLeft_ctl01_ddlJobFamilyExt' => '81020000',
  'contentLeft_ctl01_ddlJobFamilyExt' => '81030000',
  'contentLeft_ctl01_ddlJobFamilyExt' => '81040000',
  'contentLeft_ctl01_ddlJobFamilyExt' => '81300000',
  'contentLeft_ctl01_ddlJobFamilyExt' => '81410000',
  'contentLeft_ctl01_ddlJobFamilyExt' => '81420000',
  'contentLeft_ctl01_ddlJobFamilyExt' => '81430000',
  'contentLeft_ctl01_ddlJobFamilyExt' => '81600000',
  'contentLeft_ctl01_ddlJobFamilyExt' => '81700000',
  'contentLeft_ctl01_ddlJobFamilyExt' => 'Lvl3_Other',
  'contentLeft_ctl01_ddlBusinessUnitExt' => '52000100',
  'contentLeft_ctl01_ddlBusinessUnitExt' => '52000200',
  'contentLeft_ctl01_ddlBusinessUnitExt' => '52000300',
  'contentLeft_ctl01_ddlBusinessUnitExt' => '52000900',
  'contentLeft_ctl01_ddlBusinessUnitExt' => '53000010',
  'contentLeft_ctl01_ddlBusinessUnitExt' => '53000013',
  'contentLeft_ctl01_ddlBusinessUnitExt' => '53000020',
  'contentLeft_ctl01_ddlBusinessUnitExt' => '53000022',
  'contentLeft_ctl01_ddlBusinessUnitExt' => '53000026',
  'contentLeft_ctl01_ddlBusinessUnitExt' => '53000033',
  'contentLeft_ctl01_ddlBusinessUnitExt' => '53000038',
  'contentLeft_ctl01_ddlBusinessUnitExt' => '53000041',
  'contentLeft_ctl01_ddlBusinessUnitExt' => '53000054',
  'contentLeft_ctl01_ddlBusinessUnitExt' => '53000055',
  'contentLeft_ctl01_ddlBusinessUnitExt' => '53000056',
  'contentLeft_ctl01_ddlBusinessUnitExt' => '53000061',
  'contentLeft_ctl01_ddlBusinessUnitExt' => '53000063',
  'contentLeft_ctl01_ddlBusinessUnitExt' => '53000100',
  'contentLeft_ctl01_ddlBusinessUnitExt' => '53000300',
  'contentLeft_ctl01_ddlBusinessUnitExt' => '53000900',
  'contentLeft_ctl01_ddlBusinessUnitExt' => '53000901',
  'contentLeft_ctl01_ddlBusinessUnitExt' => '51000000',
  'contentLeft_ctl01_ddlJobLevelExt' => 'ENTRY_LEVEL'
}

response = HTTParty.post("#{@base_url}/nl/careers/vacatures/", :body => data)

html = Nokogiri::HTML(response)

jobs = html.xpath('//h3//a')

jobs.each do |job|
  puts job.text
end

puts jobs.size

This gives me exactly the same result as before.
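One thing worth noting about the hashes above: a Ruby Hash holds a single value per key, so when the same key is written many times only the last value is kept and sent. The sketch below illustrates this with stdlib calls only, and shows one way to keep repeated fields by encoding an array of pairs instead. Whether the site accepts repeated fields at all is an assumption; the question does not establish it.

```ruby
require 'uri'

# The same field name listed more than once, as in the hash above.
pairs = [
  ['contentLeft_ctl01_ddlRegionExt', 'Gelderland'],
  ['contentLeft_ctl01_ddlRegionExt', 'Limburg'],
  ['contentLeft_ctl01_ddlJobLevelExt', 'ENTRY_LEVEL']
]

# Converting to a Hash behaves like the hash literal: the last value wins.
collapsed = pairs.to_h
puts collapsed['contentLeft_ctl01_ddlRegionExt']
puts collapsed.size

# Encoding the pairs directly keeps every occurrence of the field.
body = URI.encode_www_form(pairs)
puts body
```

The resulting string can be passed as the request body in place of the hash.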

2 answers:

Answer 0: (score: 0)

I think you can work around the problem by changing this code into a loop that prints job.text only 3 times.

So change this,

jobs.each do |job|
  puts job.text
end

to this,

# Print only the first 3 job titles
jobs.first(3).each do |job|
  puts job.text
end

Answer 1: (score: -1)

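Setting the country / job level in the GUI triggers a JavaScript call. You have to explicitly set all the dropdown values (Continent, Region, Job Family, Business Unit) to the values they are given after NLD / Entry Level is selected: 1, 4, 45 and 22 respectively.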

Another thing is that the real controls are hidden; use Chrome Inspector to see them. The id of an actual control looks like:

contentLeft_ctl01_ddlCountryExt 

Hope it helps.
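A related point, offered as an assumption rather than something confirmed in this thread: field names of the form ctl00$... are typical of ASP.NET WebForms pages, which usually expect the hidden state fields from the original page (such as __VIEWSTATE and __EVENTVALIDATION) to be posted back along with the dropdown values. Forms are also submitted by each control's name attribute, not its id. A minimal sketch of building such a body follows; build_postback_body is a hypothetical helper, and the hidden field values are placeholders that would have to be read from a prior GET of the page.

```ruby
require 'uri'

# Hypothetical helper: combine the hidden inputs taken from the form
# (name => value) with the dropdown selections, then URL-encode them.
def build_postback_body(hidden_fields, selections)
  URI.encode_www_form(hidden_fields.merge(selections))
end

# Placeholder values only; the real ones come from the hidden <input>
# elements of the page returned by a prior GET request.
hidden_fields = {
  '__VIEWSTATE'       => 'PLACEHOLDER_VIEWSTATE',
  '__EVENTVALIDATION' => 'PLACEHOLDER_EVENTVALIDATION'
}

# Keys are the controls' name attributes, as used in the question.
selections = {
  'ctl00$contentLeft$ctl01$ddlCountryExt'  => 'NLD',
  'ctl00$contentLeft$ctl01$ddlJobLevelExt' => 'ENTRY_LEVEL'
}

body = build_postback_body(hidden_fields, selections)
puts body
```

The encoded string can then be sent as the body of the POST request. If the server still returns the unfiltered list, comparing this body with the one the browser sends (visible in the Network tab of Chrome Inspector) should show which fields are missing.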