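Everything works on my own computer. After uploading the files to the server, I keep getting an error: processing stops when the first image is downloaded.
Destination path '/home/matms/django_project/media_root/xxx.jpg' already exists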
def scrape(request):
    filelist = glob.glob(os.path.join("/home/matms/django_project/media_root", "*.jpg"))
    for f in filelist:
        os.remove(f)
    old_articles = Weather.objects.all()
    old_articles.delete()
    user_prof = UserProfile.objects.filter(user=request.user).first()
    if user_prof is not None:
        user_prof.last_scrape = datetime.now(timezone.utc)
        user_prof.save()
    session = requests.Session()
    session.headers = {
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/"}
    url = 'https://www.onet.pl/'
    content = session.get(url, verify=False).content
    soup = BeautifulSoup(content, "html.parser")
    posts = soup.find_all('div', {'class': 'sectionLine'})
    for post in posts:
        title = post.find('span', {'class': 'title'}).get_text()
        link = post.find("a")['href']
        image_source = post.find('img')['src']
        image_source_solved = "http:{}".format(image_source)
        media_root = '/home/matms/django_project/media_root'
        if not image_source_solved.startswith(("data:image", "javascript")):
            #exists = os.path.isfile(media_root+image_source_solved)
            exists = 1
            if exists == 2:
                pass
            else:
                local_filename = image_source_solved.split('/')[-1].split("?")[0]+".jpg"
                r = session.get(image_source_solved, stream=True, verify=False)
                with open(os.path.join(settings.MEDIA_ROOT, local_filename), 'wb') as f:
                    for chunk in r.iter_content(chunk_size=1024):
                        f.write(chunk)
                current_image_absolute_path = os.path.abspath(local_filename)
                shutil.move(current_image_absolute_path, media_root)
        new_headline = Headline()
        new_headline.title = title
        new_headline.url = link
        new_headline.image = local_filename
        new_headline.save()
        sleep(1)
Error log:
> [Thu Sep 26 10:07:31.560075 2019] [wsgi:error] [pid 10690:tid 140110581073664] [remote 89.78.216.206:56770] File
> "/home/matms/django_project/news/views.py", line 102, in scrape
> [Thu Sep 26 10:07:31.560078 2019] [wsgi:error] [pid 10690:tid 140110581073664] [remote 89.78.216.206:56770]
> shutil.move(current_image_absolute_path, media_root)
Answer 0 (score: 1)
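You are writing the file to settings.MEDIA_ROOT / local_filename and then moving it to media_root.
Apparently settings.MEDIA_ROOT and media_root are the same directory on your server, so shutil.move complains that the file already exists.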
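
A minimal sketch of what this means in practice, assuming the download is already written into the final directory: the later shutil.move call can simply be dropped. The helper names save_chunks and reproduce_move_error are hypothetical and only serve the illustration; a temporary directory would stand in for settings.MEDIA_ROOT when trying it out.

```python
import os
import shutil
import tempfile


def save_chunks(chunks, target_dir, filename):
    """Write an iterable of byte chunks straight to target_dir/filename.

    Hypothetical helper: because the file lands in its final directory,
    no follow-up shutil.move is needed.
    """
    path = os.path.join(target_dir, filename)
    with open(path, "wb") as f:
        for chunk in chunks:
            f.write(chunk)
    return path


def reproduce_move_error(target_dir, filename):
    """Return the message shutil.move raises when target_dir already holds filename.

    Mirrors the pattern from the question: the source path is resolved
    against the current working directory, the destination is a directory
    that already contains a file with the same name.
    """
    try:
        shutil.move(os.path.abspath(filename), target_dir)
    except shutil.Error as exc:
        return str(exc)
    return None
```

In the view from the question this would roughly correspond to calling save_chunks(r.iter_content(chunk_size=1024), settings.MEDIA_ROOT, local_filename) and deleting the two lines that build current_image_absolute_path and call shutil.move. Reading the directory from settings.MEDIA_ROOT in one place, rather than also hard-coding media_root, avoids the two paths silently pointing at the same location.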