Running Scrapy from a Shell Script

Date: 2013-09-08 17:53:20

Tags: bash, scrapy

I'm trying to run several spiders from a shell script, like so:

#!/bin/bash
source /home/pi/.bashrc
source /usr/local/bin/virtualenvwrapper.sh

workon email-scraper
cdvirtualenv

#Change to dir containing scrapy.cfg
cd ./sitecode/jobscraper
pwd

scrapy crawl site1
scrapy crawl site2
scrapy crawl site3

# Email me the jobs.csv file
python '/home/pi/Documents/jobscraper/scripts/jobs-mail.py'

# Delete the file so a new one is created on next scrape
# sudo rm /home/pi/Documents/jobscraper/csv/jobs.csv

The only part that executes correctly is the final Python script, which emails me an empty CSV file, because none of the scrapers ran. When I run the script from bash, I get the following output:

(email-scraper)pi@raspberrypi ~/.virtualenvs/email-scraper/sitecode $ sudo sh runspiders.sh 
ERROR: Environment 'email-scraper' does not exist. Create it with 'mkvirtualenv email-scraper'.
ERROR: no virtualenv active, or active virtualenv is missing
runspiders.sh: line 9: cd: ./sitecode/jobscraper: No such file or directory
/home/pi/.virtualenvs/email-scraper/sitecode
runspiders.sh: line 13: scrapy: command not found
runspiders.sh: line 14: scrapy: command not found
runspiders.sh: line 15: scrapy: command not found
runspiders.sh: line 16: scrapy: command not found
runspiders.sh: line 17: scrapy: command not found
runspiders.sh: line 18: scrapy: command not found
runspiders.sh: line 19: scrapy: command not found
runspiders.sh: line 20: scrapy: command not found
runspiders.sh: line 21: scrapy: command not found
runspiders.sh: line 22: scrapy: command not found

I'm new to shell scripting. Can anyone show me how to make sure the virtualenv is activated and the working directory is correct before the first spider is called?
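One way to make a script like this robust, regardless of how it is invoked, is to activate the virtualenv by absolute path inside the script itself, instead of depending on .bashrc and virtualenvwrapper being loaded. A minimal sketch, using the paths from the question; the activate_and_cd helper is a hypothetical name, not part of any tool:

```shell
#!/bin/bash
# Hypothetical helper: activate a virtualenv by absolute path and change
# into the project directory, failing loudly if either step is impossible.
activate_and_cd() {
    venv="$1"
    project="$2"
    if [ ! -f "$venv/bin/activate" ]; then
        echo "no virtualenv at $venv" >&2
        return 1
    fi
    # Sourcing bin/activate replaces virtualenvwrapper's workon here.
    source "$venv/bin/activate"
    cd "$project" || return 1
}

# Paths taken from the question; adjust to your own layout.
if activate_and_cd "$HOME/.virtualenvs/email-scraper" \
                   "$HOME/.virtualenvs/email-scraper/sitecode/jobscraper"; then
    scrapy crawl site1
fi
```

Because everything is an absolute path, this works the same whether it is run from the home directory, from cron, or from anywhere else.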

Edit, in reply to @konsolebox:

This is how I run the scrapers manually from my home directory:

First, I source .bashrc, since for some reason the Raspberry Pi doesn't do this automatically:

source .bashrc

This gives me access to virtualenvwrapper. Then I can do

pi@raspberrypi ~ $ workon email-scraper
(email-scraper)pi@raspberrypi ~ $ cdvirtualenv

which puts me in the virtualenv's project directory, /home/pi/.virtualenvs/email-scraper.

Then I do

cd sitecode/jobscraper

and ls -al confirms I'm in the directory with access to scrapy.cfg, which I need in order to run the scrapers:

drwxr-xr-x 3 pi   pi    4096 Sep  9 19:40 .
drwxr-xr-x 5 pi   pi    4096 Sep  8 19:41 ..
drwxr-xr-x 3 pi   pi    4096 Sep  8 14:59 jobscraper
-rwxr-xr-x 1 pi   pi     632 Sep  8 22:18 runspiders.sh
-rw-r--r-- 1 root root 12288 Sep  9 19:40 .runspiders.sh.swp
-rw-r--r-- 1 pi   pi     381 Sep  7 23:34 scrapy.cfg

Then I can run scrapy crawl site1 to start a scraper.
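The manual steps above can be sketched as one non-interactive script. A likely reason sourcing .bashrc "doesn't work" from a script is that most .bashrc files return early in non-interactive shells, so sourcing virtualenvwrapper.sh directly is more reliable. The wrapper path below is the one from the question and may differ on other systems:

```shell
#!/bin/bash
# WORKON_HOME tells virtualenvwrapper where the environments live.
export WORKON_HOME="$HOME/.virtualenvs"

# Source virtualenvwrapper.sh directly instead of relying on .bashrc,
# which usually bails out early when the shell is non-interactive.
WRAPPER=/usr/local/bin/virtualenvwrapper.sh
if [ -f "$WRAPPER" ]; then
    source "$WRAPPER"
    workon email-scraper
    cdvirtualenv
    cd sitecode/jobscraper && scrapy crawl site1
else
    echo "virtualenvwrapper.sh not found at $WRAPPER" >&2
fi
```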

1 Answer:

Answer 0 (score: 0)

You probably need to run the script from within ~/.virtualenvs/email-scraper/. Do cd ~/.virtualenvs/email-scraper/ before running it.

Once there, run

sh sitecode/runspiders.sh

or

sudo sh sitecode/runspiders.sh


#!/bin/bash

cd /home/pi
source ./.bashrc
## source /usr/local/bin/virtualenvwrapper.sh  ## You didn't run this.

workon email-scraper
cdvirtualenv

cd ./sitecode/jobscraper

scrapy crawl site1

Run it with bash script.sh, not sudo sh script.sh.
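This last point explains the errors in the question: sh may not be bash (so bash-specific behavior and virtualenvwrapper's shell functions can be unavailable), and sudo resets the environment, dropping the variables and PATH additions your login shell set up. A small demo, using env -i as a stand-in for sudo's environment reset:

```shell
#!/bin/bash
# A variable exported to a child shell survives normally...
FOO=bar sh -c 'echo "normal env:  [$FOO]"'        # prints [bar]
# ...but is gone once the environment is cleared, which is roughly what
# sudo's env_reset does to PATH additions and virtualenv variables.
FOO=bar env -i sh -c 'echo "cleared env: [$FOO]"'  # prints []
```

This is why `workon` and `scrapy` were "command not found" under sudo sh even though they work fine in the interactive shell.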