I apologize for reposting; the title of my previous post was confusing. In the spider example below, how can I use PyInstaller (or some other bundler) to build an executable (e.g. myspidy.exe) so that end users do not need to install Scrapy and Python in their Windows environment? Currently, with Python and Scrapy installed, the spider is run by executing the command "scrapy crawl quotes". The end user would download and run "myspidy.exe" on a Windows PC with no Python or Scrapy preinstalled. Thank you very much!
import scrapy

class QuotesSpider(scrapy.Spider):
    name = "quotes"

    def start_requests(self):
        urls = [
            'http://quotes.toscrape.com/page/1/',
            'http://quotes.toscrape.com/page/2/',
        ]
        for url in urls:
            yield scrapy.Request(url=url, callback=self.parse)

    def parse(self, response):
        page = response.url.split("/")[-2]
        filename = 'quotes-%s.html' % page
        with open(filename, 'wb') as f:
            f.write(response.body)
        self.log('Saved file %s' % filename)
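As a side note, the filename logic in parse() depends only on the URL string, so it can be checked without Scrapy installed. A minimal sketch (filename_for is a hypothetical helper written for illustration, not part of the spider):

```python
# Derive the output filename the same way parse() does:
# the second-to-last path segment of the URL is the page number.
def filename_for(url):
    page = url.split("/")[-2]
    return 'quotes-%s.html' % page

print(filename_for('http://quotes.toscrape.com/page/1/'))  # quotes-1.html
print(filename_for('http://quotes.toscrape.com/page/2/'))  # quotes-2.html
```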
Thank you, EVHZ. I made the changes to the code as you suggested, and got the following error at runtime.
D:\craftyspider\spidy\spidy\spiders\dist>.\runspidy
Traceback (most recent call last):
File "spidy\spiders\runspidy.py", line 35, in <module>
File "site-packages\scrapy\crawler.py", line 249, in __init__
File "site-packages\scrapy\crawler.py", line 137, in __init__
File "site-packages\scrapy\crawler.py", line 326, in _get_spider_loader
File "site-packages\scrapy\utils\misc.py", line 44, in load_object
File "importlib\__init__.py", line 126, in import_module
File "<frozen importlib._bootstrap>", line 994, in _gcd_import
File "<frozen importlib._bootstrap>", line 971, in _find_and_load
File "<frozen importlib._bootstrap>", line 953, in _find_and_load_unlocked
ModuleNotFoundError: No module named 'scrapy.spiderloader'
[14128] Failed to execute script runspidy
Answer 0 (score: 2)
To keep everything in a single Python file that you run with the following command:
python script.py
you can use the code you have and add a few things:
import scrapy
from scrapy.crawler import CrawlerProcess
from scrapy.utils.project import get_project_settings

# Useful if you have a settings.py
settings = get_project_settings()

# Your code
class QuotesSpider(scrapy.Spider):
    name = "quotes"

    def start_requests(self):
        ...

# Create a process
process = CrawlerProcess(settings)
process.crawl(QuotesSpider)
process.start()
Save it as script.py. Then, use pyinstaller:
pyinstaller --onefile script.py
This will generate the bundle in a subdirectory called dist.
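Regarding the ModuleNotFoundError: No module named 'scrapy.spiderloader' from the follow-up: PyInstaller's static analysis can miss modules that Scrapy imports dynamically by name, so they never make it into the bundle. A common workaround is to declare them as hidden imports at build time. A sketch only; the exact set of modules you need to add depends on your build, and scrapy.spiderloader is just the one named in the traceback:

```shell
# Rebuild, explicitly bundling the module PyInstaller failed to detect.
pyinstaller --onefile --hidden-import scrapy.spiderloader script.py
```

If further ModuleNotFoundError messages appear when running the new executable, add a --hidden-import flag for each missing module and rebuild.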