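OK, a bit of an introduction. I'm trying to build a bot that scrapes images and posts from reddit and tweets them. This is not my whole project, but it is nearly everything relevant.
So the thing is, I need to download the images into a directory named "pics", but I also need to log the reddit post ID (which I use in other functions not shown here) to a text file, and I need to read through that file to check whether the post has already been tweeted. That is what the already_tweeted function does.
Here is the problem: on the first pass through my loop, it reads the file (currently empty) and then goes on to the get_image function. I need to change the current directory to the image directory so that the downloaded images are stored there (os.chdir(img_dir)). Now that the directory has changed, when I go back to read the oldies.txt file, it says the file doesn't exist, because it is in the original directory, not in the pics directory.
So what I need to do is return to the original directory after downloading the images into the "pics" directory, and I'm completely stuck.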
import time
import os
from bs4 import BeautifulSoup as bs
import requests
import praw
import tweepy

def tweet_creator(subreddit_info):
    '''Goes through posts on reddit and extracts a shortened link, title & ID'''
    post_links = [] #list to store our links
    post_titles = [] #list to store our titles
    post_ids = [] #list to store our id's
    post_imgs = []
    print("[bot] extracting posts from sub-reddit")
    for submission in subreddit_info.new(limit=5):
        if not already_tweeted(submission.id):
            post_titles.append(submission.title)
            post_links.append(submission.shortlink)
            post_ids.append(submission.id)
            post_imgs = get_image(submission.url)
            print(post_imgs)
        else:
            print("Already Tweeted")
    return post_links, post_titles, post_ids, post_imgs

def already_tweeted(post_id):
    '''reads through our .txt file and determines if tweet has already been posted'''
    found = 0
    print(os.getcwd())
    with open(posted_reddit_ids, 'r') as f:
        for line in f:
            if post_id in line:
                found = 1
                break
    return found

def get_image(img_url):
    url = img_url
    r = requests.get(url, headers = {'User-Agent' : 'reddit Twitter tool monitoring (by /u/RivianJourneyMan)'})
    data = r.text
    soup = bs(data, 'lxml')
    image_tags = soup.findAll('img')
    os.chdir(img_dir)
    x = 0
    mylist = []
    for image in image_tags:
        url = image['src']
        source = requests.get(url, stream = True)
        if source.status_code == 200:
            img_file = img_dir + str(x) + '.jpg'
            with open(img_file, 'wb') as f:
                f.write(requests.get(url).content)
                mylist.append(img_file)
            f.close()
            x += 1
            return img_file
        else:
            mylist.append(None)
    return mylist
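Again, this is not my complete code, only the relevant parts, but this is the output when I run it. As you can see in the already_tweeted function, I print the current directory on each loop iteration to make the problem easier to see.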
[bot] Setting up connection with reddit
[bot] extracting posts from sub-reddit
C:\Users\ali\PycharmProjects\cyberbot
pics0.jpg
C:\Users\ali\PycharmProjects\cyberbot\pics
Traceback (most recent call last):
  File "C:/Users/ali/PycharmProjects/cyberbot/rtbot_with_img2.py", line 154, in <module>
    main()
  File "C:/Users/ali/PycharmProjects/cyberbot/rtbot_with_img2.py", line 150, in main
    post_links, post_titles, post_ids, post_imgs = tweet_creator(subreddit)
  File "C:/Users/ali/PycharmProjects/cyberbot/rtbot_with_img2.py", line 58, in tweet_creator
    if not already_tweeted(submission.id):
  File "C:/Users/ali/PycharmProjects/cyberbot/rtbot_with_img2.py", line 81, in already_tweeted
    with open(posted_reddit_ids, 'r') as f:
FileNotFoundError: [Errno 2] No such file or directory: 'oldies.txt'
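I have looked everywhere and tried as many solutions as I could find for getting back to the original directory, but none of them worked. I tried os.chdir('..') placed just above 'return mylist', but that did not help. I want to go into the "pics" directory to download my images, and then go back to my original directory, "cyberbot", so that I can read the oldies.txt file and see whether the post has already been tweeted.
Perhaps this picture of how my directories and files are laid out will help.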
Answer 0 (score: 0)
Note: there are two ways to solve this.
PS: the os.chdir('..') you tried should work.
1) Get the paths of the pics and oldies.txt directories
os.chdir('..')
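One way to read step 1 is to resolve both locations to absolute paths once, before anything calls os.chdir(). The following is a minimal sketch under that assumption, not the asker's actual code: build_paths and the extra ids_path parameter are hypothetical names introduced for illustration, while "pics" and "oldies.txt" come from the question.

```python
import os

def build_paths(base_dir):
    """Return absolute paths for the image directory and the ID log file."""
    base_dir = os.path.abspath(base_dir)
    img_dir = os.path.join(base_dir, "pics")
    posted_reddit_ids = os.path.join(base_dir, "oldies.txt")
    return img_dir, posted_reddit_ids

def already_tweeted(post_id, ids_path):
    """Return True if post_id appears on any line of the log file."""
    if not os.path.exists(ids_path):
        return False  # nothing has been logged yet
    with open(ids_path, "r") as f:
        return any(post_id in line for line in f)

# Resolve the paths once at startup, before any os.chdir() call.
img_dir, posted_reddit_ids = build_paths(os.getcwd())
```

Because both paths are absolute, opening posted_reddit_ids keeps working even while the current directory is pics.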
2) If you know the full path to oldies.txt, you can set it up as follows. I don't use Windows much, but I know its file path syntax differs from Unix.
img_dir = "/Users/ali/PycharmProjects/cyberbot/pics"
oldies_dir = "/Users/ali/PycharmProjects/cyberbot"

def already_tweeted(post_id):
    # begin by changing the directory first
    os.chdir(oldies_dir)
    ...

def get_image(image_url):
    os.chdir(img_dir)
    ...
No matter how many times you change directories, you can then read the file from anywhere in your code.
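If changing directories is still preferred, a small context manager can guarantee the return to the original directory, even when the download code returns early or raises. This is only a sketch: working_directory and save_bytes are illustrative names that do not appear in the question, and the payload stands in for the downloaded image bytes.

```python
import os
from contextlib import contextmanager

@contextmanager
def working_directory(path):
    """Temporarily switch the current directory, restoring it on exit."""
    previous = os.getcwd()
    os.chdir(path)
    try:
        yield
    finally:
        # Runs on normal exit, on an early return, and on exceptions.
        os.chdir(previous)

def save_bytes(img_dir, name, payload):
    """Write payload to img_dir/name while temporarily inside img_dir."""
    with working_directory(img_dir):
        with open(name, "wb") as f:
            f.write(payload)
    return os.path.join(img_dir, name)
```

In get_image, the file-writing part could sit inside `with working_directory(img_dir):`, so the next call to already_tweeted runs from the original directory again. On Python 3.11 and later, contextlib.chdir offers the same behaviour out of the box.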