How do I save crawling (scraping, streaming) results?

Date: 2017-04-20 15:38:03

Tags: python twitter web-scraping stream web-crawler

The crawling (scraping, streaming) itself works fine and prints results like this:

e.g. 973 : {'text': 'RT @1111: hihihihihihi'}

BUT I can't save them.

How can I fix this?

import tweepy
import time
import os
import json
import simplejson

search_term = '5555'
search_term2= '4444'
search_term3='3333'
search_term4='2222'
search_term5='1111'

lat = "11.11"
lon = "11.11"
radius = "100km"


API_key = "0"
API_secret = "0"
Access_token = "0"
Access_token_secret = "0"

location = "%s,%s,%s" % (lat, lon, radius)

auth = tweepy.OAuthHandler(API_key, API_secret)
auth.set_access_token(Access_token, Access_token_secret)

api = tweepy.API(auth)

c=tweepy.Cursor(api.search,
                q="{}+OR+{}".format(search_term, search_term2, search_term3, search_term4, search_term5),
                rpp=1000,
                geocode=location,
                include_entities=True)

data = {}
i = 1
for tweet in c.items():
    data['text'] = tweet.text
    print(i, ":", data)
    time.sleep(0.4)
    i += 1
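A side note on the query above: the format string has only two `{}` placeholders, and `str.format` silently ignores extra positional arguments, so only the first two of the five search terms end up in `q`. A minimal sketch of building the query from all five terms, assuming the intent was to match any of them with Twitter's `OR` search operator (the list name `search_terms` is introduced here for illustration):

```python
# The five terms from the question, collected in a list (hypothetical name).
search_terms = ["5555", "4444", "3333", "2222", "1111"]

# What the original expression produces: extra arguments are dropped.
original_query = "{}+OR+{}".format(*search_terms)

# Join every term with the OR operator instead.
query = " OR ".join(search_terms)

print(original_query)
print(query)
```

The resulting `query` string could then be passed as `q=query` in the `tweepy.Cursor` call.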

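With the code below, either no txt file is generated (and there is no error message),

or a txt file is created but it does not contain the tweet text and tweet date (again with no error message).

(It doesn't have to be a txt file. Saving to an Excel file would be fine too.)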

wfile = open(os.getcwd()+"/tqtq.txt", mode='w')   
data = {}   
i = 0       

for tweet in c.items():
    data['text'] = tweet.text
    data['date']= tweet.text 
    wfile.write(data['text','date']+'\n')  
    i += 1
    time.sleep(0.4)
wfile.close()
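Two things in the loop above stand out: `data['date']` is assigned `tweet.text` rather than a date, and `data['text','date']` looks up a single tuple key `('text', 'date')`, which the dict does not contain. Below is a minimal sketch of one way to write the text and date of each tweet to a CSV file, which Excel can open. It assumes each tweet object exposes `.text` and `.created_at` (a `datetime`), as tweepy status objects do; the function name `save_tweets`, the file name, and the stand-in tweets are hypothetical and only there so the example runs without Twitter credentials.

```python
import csv
import os
import tempfile
from datetime import datetime
from types import SimpleNamespace


def save_tweets(tweets, path):
    """Write one CSV row (date, text) per tweet and return the number of rows written."""
    count = 0
    # newline="" lets the csv module handle line endings; utf-8 keeps non-ASCII text intact.
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["date", "text"])
        for tweet in tweets:
            writer.writerow([tweet.created_at.isoformat(), tweet.text])
            count += 1
    return count


# Stand-in objects (hypothetical data) that mimic the two attributes used above.
fake_tweets = [
    SimpleNamespace(text="RT @1111: hihihihihihi",
                    created_at=datetime(2017, 4, 20, 15, 38, 3)),
    SimpleNamespace(text="second tweet,\nwith a comma and newline",
                    created_at=datetime(2017, 4, 20, 15, 39, 0)),
]

out_path = os.path.join(tempfile.mkdtemp(), "tweets.csv")
saved = save_tweets(fake_tweets, out_path)
print(saved, "tweets saved to", out_path)
```

With the cursor from the question, the same function could presumably be called as `save_tweets(c.items(), out_path)`. The `with` block closes the file even if the loop raises partway through, so rows written before an error are not lost.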

1 answer:

Answer 0: (score: 0)

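You can try using pickle:

import pickle
# pickle.dump expects an open binary file object, not a filename
with open(filename, 'wb') as f:
    pickle.dump(obj, f)

To load it back:

with open(filename, 'rb') as f:
    result = pickle.load(f)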
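As a rough illustration of how this might apply to the question, the sketch below pickles a list of tweet records and loads it back. Note that `pickle.dump` and `pickle.load` take an open binary file object rather than a filename. The record contents and the file name are hypothetical stand-ins for data collected from the cursor.

```python
import os
import pickle
import tempfile

# Hypothetical records, shaped like the dict built in the question's loop.
records = [
    {"text": "RT @1111: hihihihihihi", "date": "2017-04-20 15:38:03"},
    {"text": "another tweet", "date": "2017-04-20 15:39:00"},
]

path = os.path.join(tempfile.mkdtemp(), "tweets.pkl")

# Pickle files are binary, so open in "wb" for writing and "rb" for reading.
with open(path, "wb") as f:
    pickle.dump(records, f)

with open(path, "rb") as f:
    restored = pickle.load(f)

print(restored == records)
```

A pickle file is not human-readable and should only be loaded from sources you trust, so for results meant to be opened in a text editor or Excel a text-based format may be the better fit.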