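This is a Twitter scraping script that extracts tweets containing well-known keywords.
I want to repeat the entire code below every 12 hours (or every 12 hours plus a 10-minute break). Can you suggest a looping construct for this?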
import tweepy
import time
import os
import json
import simplejson
search_term = 'word1'
search_term2= 'word2'
search_term3='word3'
lat = "xxxx"
lon = "xxxx"
radius = "xxxx"
location = "%s,%s,%s" % (lat, lon, radius)
API_key = "xxxx"
API_secret = "xxxx"
Access_token = "xxxx"
Access_token_secret = "xxxx"
auth = tweepy.OAuthHandler(API_key, API_secret)
auth.set_access_token(Access_token, Access_token_secret)
api = tweepy.API(auth)
# Build one query that matches any of the three search terms.
c = tweepy.Cursor(api.search,
                  q="{} OR {} OR {}".format(search_term, search_term2, search_term3),
                  rpp=1000,
                  geocode=location,
                  include_entities=True)
data = {}
i = 1
for tweet in c.items():
    data['text'] = tweet.text
    print(i, ":", data)
    i += 1
    time.sleep(1)
# Write one tweet per line; the with-block closes the file automatically.
with open(os.getcwd()+"/workk2.txt", mode='w') as wfile:
    data = {}
    i = 0
    for tweet in c.items():
        data['text'] = tweet.text
        wfile.write(data['text']+'\n')
        i += 1
Answer 0 (score: 1)
You can set up a cron job that runs the script every 12 hours. To do so, save the script with a .py extension and make it executable, then add it to your crontab:
0 */12 * * * /usr/bin/python /path/to/yourscript.py
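One thing to watch for when the script runs under cron: the job usually does not start in the script's own folder, so a path built from os.getcwd() may point somewhere unexpected. A minimal sketch of one way around this is to build the output path from the script's location instead. The helper name output_path and the example path /home/user/scraper are assumptions made for illustration only.

```python
import os


def output_path(script_file, filename):
    """Return the path of `filename` in the same folder as `script_file`.

    The result does not depend on the current working directory.
    """
    script_dir = os.path.dirname(os.path.abspath(script_file))
    return os.path.join(script_dir, filename)


# Inside the scraper this would be used roughly as:
#     with open(output_path(__file__, "workk2.txt"), mode="w") as wfile:
#         ...
# The path below is a made-up example.
print(output_path("/home/user/scraper/yourscript.py", "workk2.txt"))
```

With this in place the output file lands next to the script whether it is started by hand or by cron.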
For more details, see this question. Alternatively, a Python package such as APScheduler can help you achieve this. In APScheduler, you can define a job like this:
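from apscheduler.schedulers.blocking import BlockingScheduler

sched = BlockingScheduler()

# Run timed_job once every 12 hours.
@sched.scheduled_job('interval', hours=12)
def timed_job():
    print('This job is run every 12 hours.')

# Optional: pass your own settings with sched.configure(...) before starting.
sched.start()  # blocks and keeps running the scheduled jobs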
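If you would rather not depend on cron or an extra package, the "12 hours plus a 10-minute break" variant from the question can also be sketched with a plain loop and time.sleep. This is only a rough sketch under a few assumptions: scrape_tweets is a hypothetical placeholder for the Tweepy code from the question, and the max_runs and sleep parameters are there purely so the loop can be tried out without actually waiting.

```python
import time

TWELVE_HOURS = 12 * 60 * 60   # seconds
TEN_MINUTES = 10 * 60         # seconds


def scrape_tweets():
    """Hypothetical placeholder: move the Tweepy search-and-save code here."""
    print("scraping tweets...")


def run_every(job, interval_seconds, max_runs=None, sleep=time.sleep):
    """Call job(), wait interval_seconds, and repeat.

    With max_runs=None the loop never ends. A failing run is reported
    and the loop carries on with the next scheduled run.
    """
    runs = 0
    while max_runs is None or runs < max_runs:
        try:
            job()
        except Exception as exc:
            print("run failed:", exc)
        runs += 1
        if max_runs is not None and runs >= max_runs:
            break  # no need to wait after the final run
        sleep(interval_seconds)
    return runs


# To start the real loop (this call blocks indefinitely):
#     run_every(scrape_tweets, TWELVE_HOURS + TEN_MINUTES)
```

Note that the wait starts after each run finishes, so the start times drift by however long the scrape takes. A scheduler such as cron or APScheduler is the better fit when the runs must start at fixed clock times.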