I have the following code in spiders/spidername.py:
# python 3
import scrapy
from urllib.parse import urljoin


class PycoderSpider(scrapy.Spider):
    name = "auru"
    start_urls = [
        'http://example.com',
    ]

    def parse(self, response):
        for post_link in response.xpath(
                '//div[@class="post mb-2"]/h2/a/@href').extract():
            url = urljoin(response.url, post_link)
            print(url)
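How do I change it (I'm not familiar with Python) so that it gets the content
of div.className from the site site.com, using URLs that follow the mask site.com/id,
where id is a number from 101 to 100101, and only when such a page exists?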