I am using Nokogiri, and at this point my variable holds the source of a page: doc = Nokogiri::HTML(open(page))
The page contains a script that makes ajax calls:
<script type="text/javascript" charset="utf-8">
$(document).ready(function(){
$("#menu").kendoMenu();
$('.menu_item').on('click', function (e){
$.ajax({
url: '/movie/101299-the-hunger-games-catching-fire/images?kind=backdrop&language=' + $(this).attr('alt') + '&translate=false',
cache: false
}).done(function(response) {
$('#image_panel').html(response);
});
});
$.ajax({
url: '/movie/101299-the-hunger-games-catching-fire/images?kind=backdrop&language=&translate=false', //goal
cache: false
}).done(function(response) {
$('#image_panel').html(response);
});
});
</script>
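Is there a way to get the URL of the second request and put it into a variable? I need to go to that URL. Unfortunately I have not found anything about this. Could phantomjs perhaps help?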
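Answer 0 (score: 1)
I think you will have to parse the script element manually. You can use Nokogiri to get the text of the script element, then use a regular expression to find the last URL.
Assuming the script is the first script element on the page, you can do the following: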
url = doc.at_css('script').text.scan(/url: '(.*)'/).last.first
The following breaks that one-liner into individual steps, with an explanation of each:
# Get the text of the script element
# Note that this assumes it is the first script element (you may need to be more specific)
script = doc.at_css('script').text
# Find all urls in the script
urls = script.scan(/url: '(.*)'/)
# Of the urls found, take the last one
url = urls.last
# url is actually an array of length 1, since we used a matching group in the regex
# Take the first element of the array to get the url as a string
url = url.first
#=> "/movie/101299-the-hunger-games-catching-fire/images?kind=backdrop&language=&translate=false"
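The same idea can be sketched without a network request, so the regex step can be checked on its own. This is only an illustration under stated assumptions: the `script` string stands in for `doc.at_css('script').text` and is an abbreviated copy of the script from the question, the pattern uses `[^']*` instead of `.*` so that a line containing several quoted pieces is not over-matched, and the base address `http://example.com/` is hypothetical (the question does not say which host the page came from).

```ruby
require 'uri'

# Stand-in for doc.at_css('script').text (abbreviated copy of the script in the question).
# The quoted heredoc tag ('JS') disables Ruby string interpolation inside the text.
script = <<~'JS'
  $('.menu_item').on('click', function (e){
  $.ajax({
  url: '/movie/101299-the-hunger-games-catching-fire/images?kind=backdrop&language=' + $(this).attr('alt') + '&translate=false',
  cache: false
  })
  });
  $.ajax({
  url: '/movie/101299-the-hunger-games-catching-fire/images?kind=backdrop&language=&translate=false', //goal
  cache: false
  })
JS

# [^']* stops at the first closing quote; flatten turns [[a], [b]] into [a, b]
urls = script.scan(/url:\s*'([^']*)'/).flatten

# The second ajax call is the last match
relative_url = urls.last

# Hypothetical base: replace with the address the page was actually fetched from
base = 'http://example.com/'
absolute_url = URI.join(base, relative_url).to_s

puts absolute_url
```

The URL in the script is relative, so it has to be resolved against the page's address before it can be requested; `URI.join` from the standard library does that here.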