Question

I am getting this error (AttributeError: 'str' object has no attribute 'xpath') and cannot resolve it. Here is my code:

import scrapy
class torrentSpider(scrapy.Spider):
    name = 'torrent'
    start_urls = ['https://www.1337x.to/series-library/b/1/']
    page_number = 2

    def parse(self,response):
        href = response.xpath('.//div[@class="movie-info"]/h3/a/@href').extract()
        for urls in href:
            yield {"Linkss" : "https://1337x.to" + urls}
        for alphabets in list(map(chr, range(ord('a'), ord('z')+1))):
            alpha_url = f'https://www.1337x.to/series-library/{alphabets}/1/'
            last_page = alpha_url.xpath('.//div[@class="pagination"]/ul/li/a/text()')[-2].extract()
            for numbers in str(self.page_number):
                next_page = "https://www.1337x.to/series-library/" + alphabets + "/" + str(numbers)+"/"
                if self.page_number <= int(last_page) :
                    self.page_number += 1
                    yield response.follow(next_page,callback=self.parse,dont_filter = True  )

I have already tried removing the line "last_page = alpha_url.xpath('.//div[@class="pagination"]/ul/li/a/text()')[-2].extract()", but it does not work. Any help would be appreciated.

Answer 1

Looking at your code:

    alpha_url = f'https://www.1337x.to/series-library/{alphabets}/1/'
    last_page = alpha_url.xpath(...)...

You explicitly set alpha_url to a string. On the very next line you try to call a method that strings do not have.
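The failure can be reproduced without Scrapy at all, which suggests the problem is the type of alpha_url rather than the XPath expression itself. A minimal sketch, where the letter "a" is only an example value standing in for the loop variable:

```python
# An f-string always evaluates to a plain str, whatever the text looks like.
alphabets = "a"  # example value; the spider loops over 'a'..'z'
alpha_url = f"https://www.1337x.to/series-library/{alphabets}/1/"

print(type(alpha_url).__name__)     # str
print(hasattr(alpha_url, "xpath"))  # False

try:
    # Same call as in the question: str has no .xpath(), so this raises.
    alpha_url.xpath('.//div[@class="pagination"]/ul/li/a/text()')
except AttributeError as exc:
    error_message = str(exc)
    print(error_message)
```

Nothing has been downloaded at this point; the URL is only text, so there is no HTML for an XPath query to run against.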

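You have to fetch the page's HTML (just as was done for response above) and call xpath on that. xpath cannot operate on a URL string variable. In Scrapy, that means yielding a request for the URL and reading the pagination inside the callback that receives the downloaded response.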

Scrapy: AttributeError: 'str' object has no attribute 'xpath' in Python web scraping
