Question

I'm using Nokogiri, and at this point my variable holds the source of some page: doc = Nokogiri::HTML(open(page)). The page contains a script that makes ajax calls:

<script type="text/javascript" charset="utf-8">         
      $(document).ready(function(){
        $("#menu").kendoMenu();    
        $('.menu_item').on('click', function (e){
          $.ajax({
            url: '/movie/101299-the-hunger-games-catching-fire/images?kind=backdrop&language=' + $(this).attr('alt') + '&translate=false',
            cache: false
          }).done(function(response) {
            $('#image_panel').html(response);
          });
        });

        $.ajax({
          url: '/movie/101299-the-hunger-games-catching-fire/images?kind=backdrop&language=&translate=false', //goal
          cache: false
        }).done(function(response) {
          $('#image_panel').html(response);
        });   
      });        
</script>

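Is there a way to get the URL of the second request and put it in a variable? I need to go to that URL. Unfortunately I haven't found anything about this. Maybe phantomjs could help?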

Answer 1

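I think you will have to parse the script element manually. You can use Nokogiri to get the text of the script element, then use a regular expression to find the last URL.

Assuming the script is the first script element on the page, you can do the following: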

url = doc.at_css('script').text.scan(/url: '(.*)'/).last.first

The following breaks that one-liner down, with an explanation of each step:

# Get the text of the script element
# Note that this assumes it is the first script element (you may need to be more specific)
script = doc.at_css('script').text

# Find all urls in the script
urls = script.scan(/url: '(.*)'/)

# Of the urls found, take the last one
url = urls.last

# url is actually an array of length 1, since we used a matching group in the regex
# Take the first element of the array to get the url as a string
url = url.first
#=> "/movie/101299-the-hunger-games-catching-fire/images?kind=backdrop&language=&translate=false"
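Since the goal is to go to that URL, the relative path still has to be resolved against the site's address. Below is a minimal sketch of one way to do that with the standard library only. It assumes the script text has already been extracted as above. The helper name last_ajax_url, the sample script text, and BASE_URL are hypothetical placeholders, not part of the original page. The regex uses [^']* rather than .* so that a match stops at the first closing quote instead of running to the last quote on the line.

```ruby
require 'uri'

# Hypothetical helper: return the last single-quoted value that follows
# "url:" in the script text, or nil if there is none.
# [^']* stops at the first closing quote, so a concatenated URL such as
# 'a=' + x + '&b' yields only its first literal piece.
def last_ajax_url(script_text)
  match = script_text.scan(/url:\s*'([^']*)'/).last
  match && match.first
end

# Shortened stand-in for doc.at_css('script').text
script_text = <<~JS
  $.ajax({ url: '/movie/1/images?language=' + $(this).attr('alt') + '&translate=false' });
  $.ajax({ url: '/movie/1/images?language=&translate=false', cache: false });
JS

path = last_ajax_url(script_text)
#=> "/movie/1/images?language=&translate=false"

# BASE_URL is an assumption: use the address the page was actually loaded from.
BASE_URL = 'https://example.com/movie/1'
absolute = URI.join(BASE_URL, path).to_s
#=> "https://example.com/movie/1/images?language=&translate=false"
```

The resulting absolute URL can then be fetched the same way as the original page. If the script were ever generated dynamically rather than written inline, this text-based approach would not see it, and a headless browser such as phantomjs would be the fallback.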

Get the requested page URL from an ajax call on a web page
