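I'm having some trouble filtering the results I get from PRAW. I want to exclude posts whose titles contain keywords such as [request], [off topic], or [nsfw]. I don't want posts like these to be included in the PRAW results that my script tweets. I looked for documentation but couldn't find anything on the PRAW site.
Here is my code: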
def poster():
    conn = sqlite3.connect('jb_id.db')
    c = conn.cursor()
    toTweet = []
    for submission in reddit.subreddit(SUB).hot(limit=POST_LIMIT):
        if not submission.stickied and len(submission.title) < 255:
            url = submission.shortlink
            title = submission.title
            udate = time.strftime("%Y-%m-%d %X", time.gmtime(submission.created_utc))
            try:
                # This keeps a record of the posts in the database
                c.execute("INSERT INTO posts (id, title, udate) VALUES (?, ?, ?)",
                          (url, title, udate))
                conn.commit()
                message = title + " " + url
                print(message)
                toTweet.append(message)
            except sqlite3.IntegrityError:
                # This means the post was already tweeted and is ignored
                print("Duplicate", url)
    c.close()
    conn.close()
    tweeter(toTweet)
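As you can see, I already exclude stickied posts and titles of 255 characters or more. Is there a way to also filter out Reddit posts whose titles contain the keywords I mentioned above from the PRAW results? Thanks!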
Answer 0 (score: 0)
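Make a list of keywords that must not appear in a submission's title (keep them lowercase, since the title is lowercased before comparing):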
bad_keywords = "[request]", "[off topic]", "[nsfw]"
Inside the loop, if the submission's title contains any item from that list, skip to the next submission:
title_lowercase = submission.title.lower()
if any(x in title_lowercase for x in bad_keywords):
    continue
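The check is a case-insensitive substring match, which can be tried on plain strings without touching the Reddit API. The following is a minimal sketch; `has_bad_keyword` and `BAD_KEYWORDS` are hypothetical names introduced here for illustration and are not part of PRAW.

```python
# Hypothetical helper for illustration; it only looks at a plain title string.
BAD_KEYWORDS = ("[request]", "[off topic]", "[nsfw]")  # must be lowercase


def has_bad_keyword(title, bad_keywords=BAD_KEYWORDS):
    """Return True if any keyword occurs anywhere in the title, ignoring case."""
    title_lowercase = title.lower()
    return any(keyword in title_lowercase for keyword in bad_keywords)


print(has_bad_keyword("[Request] Looking for a song"))  # True
print(has_bad_keyword("Weekly [NSFW] thread"))          # True
print(has_bad_keyword("Request for comments"))          # False
```

Because the brackets are part of each keyword, a title that merely contains the word "request" is not filtered out.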
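I would combine this with your other exclusions to reduce the nesting and make the loop more readable. The conditions are joined with "or", because any single one of them is enough to skip the submission:
bad_title = any(x in title_lowercase for x in bad_keywords)
# >= 255 mirrors the original "len(submission.title) < 255" keep-condition
skip_submission = submission.stickied or len(submission.title) >= 255 or bad_title
if skip_submission:
    continue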
Complete solution
import sqlite3
import time

# reddit, SUB, POST_LIMIT and tweeter() are defined elsewhere in your script.

def poster():
    conn = sqlite3.connect('jb_id.db')
    c = conn.cursor()
    toTweet = []
    bad_keywords = "[request]", "[off topic]", "[nsfw]"
    for submission in reddit.subreddit(SUB).hot(limit=POST_LIMIT):
        title = submission.title
        title_lowercase = title.lower()
        bad_title = any(x in title_lowercase for x in bad_keywords)
        # Skip if ANY exclusion applies: stickied, title too long, or bad keyword
        skip_submission = submission.stickied or len(title) >= 255 or bad_title
        if skip_submission:
            continue
        url = submission.shortlink
        udate = time.strftime("%Y-%m-%d %X", time.gmtime(submission.created_utc))
        try:
            # This keeps a record of the posts in the database
            c.execute("INSERT INTO posts (id, title, udate) VALUES (?, ?, ?)",
                      (url, title, udate))
            conn.commit()
            message = title + " " + url
            print(message)
            toTweet.append(message)
        except sqlite3.IntegrityError:
            # This means the post was already tweeted and is ignored
            print("Duplicate", url)
    c.close()
    conn.close()
    tweeter(toTweet)
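To try the skip logic without calling the Reddit API, the check can be pulled out into a small predicate. This is only a sketch under stated assumptions: `should_skip` and `MAX_TITLE_LEN` are hypothetical names, and `SimpleNamespace` objects stand in for PRAW submissions, mimicking just the `title` and `stickied` attributes used above.

```python
from types import SimpleNamespace

BAD_KEYWORDS = ("[request]", "[off topic]", "[nsfw]")
MAX_TITLE_LEN = 255  # assumed limit, taken from the length check in the question


def should_skip(submission, bad_keywords=BAD_KEYWORDS):
    """Return True if the submission is stickied, too long, or has a bad keyword."""
    title_lowercase = submission.title.lower()
    bad_title = any(keyword in title_lowercase for keyword in bad_keywords)
    return submission.stickied or len(submission.title) >= MAX_TITLE_LEN or bad_title


# Stand-ins for PRAW submissions (assumption: only .title and .stickied are needed).
fake_posts = [
    SimpleNamespace(title="New album out today", stickied=False),
    SimpleNamespace(title="[Off Topic] Weekend chat", stickied=False),
    SimpleNamespace(title="Subreddit rules", stickied=True),
    SimpleNamespace(title="x" * 300, stickied=False),
]

kept = [post.title for post in fake_posts if not should_skip(post)]
print(kept)  # ['New album out today']
```

In the real loop, the same predicate could be used as `if should_skip(submission): continue`, which keeps the filtering rules in one place.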