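I'm new to Python and I'm writing some code that collects tweets from two neighborhoods, saves them as JSON, and plots some of the data. Do you have any suggestions on how to extract the date and time of each tweet, or how to count the number of tweets per neighborhood? In other words, what is the best way to isolate the different variables in the JSON data stored in the .txt files into something that can be plotted?
Thanks a lot!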
from twitter import *
import sys
import os.path
import simplejson as json
import tweepy
import csv
#log into Twitter
OAUTH_TOKEN = 'XXX'
OAUTH_SECRET = 'XXX'
CONSUMER_KEY = 'XXX'
CONSUMER_SECRET = 'XXX'
t = Twitter(auth=OAuth(OAUTH_TOKEN, OAUTH_SECRET, CONSUMER_KEY, CONSUMER_SECRET))
auth = tweepy.OAuthHandler(CONSUMER_KEY, CONSUMER_SECRET)
#I consider A to be a 2km radius circle
result_A = t.search.tweets(q="kaffe",geocode="55.662610,12.604074,1.24mi",result_type='recent')
with open('data_A.txt', 'w') as outfile:
    json.dump(result_A, outfile)
#Similarly, B is a 2km radius circle
result_B = t.search.tweets(q="kaffe",geocode="55.694700,12.548283,1.24mi",result_type='recent')
with open('data_B.txt', 'w') as outfile:
    json.dump(result_B, outfile)
Answer 0 (score: 0)
If you need more advanced spreadsheet functionality in Python, you may want to take a look at http://manns.github.io/pyspread/.
Also, SQLite3 is built into Python as the sqlite3 module: just create an in-memory database and query it with SQL.
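As a rough sketch of the sqlite3 route: the table name `tweets`, the column names, and the sample rows below are made up for illustration. Real rows would be built from the 'statuses' list in each saved JSON file.

```python
import sqlite3

# Hypothetical sample rows: (neighborhood, user name, timestamp).
# In practice these would come from the saved search results.
rows = [
    ("A", "alice", "2014-05-01 09:15:00"),
    ("A", "bob", "2014-05-01 10:30:00"),
    ("B", "carol", "2014-05-02 08:05:00"),
]

conn = sqlite3.connect(":memory:")  # the database lives only in RAM
conn.execute("CREATE TABLE tweets (area TEXT, user TEXT, created TEXT)")
conn.executemany("INSERT INTO tweets VALUES (?, ?, ?)", rows)

# Number of tweets per neighborhood
counts = conn.execute(
    "SELECT area, COUNT(*) FROM tweets GROUP BY area ORDER BY area"
).fetchall()

# Number of tweets per calendar day
per_day = conn.execute(
    "SELECT date(created), COUNT(*) FROM tweets "
    "GROUP BY date(created) ORDER BY 1"
).fetchall()

print(counts)
print(per_day)
```

Both queries return plain lists of tuples, which should be easy to hand to a plotting library.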
However, here are some examples of how to do the basic operations in pure Python:
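import json
from datetime import datetime
from urllib.request import urlopen  # Python 3; on Python 2 use urllib.urlopen
# Download the saved search result and decode the JSON payload
data = urlopen('https://dl.dropboxusercontent.com/u/2684973/data_A.txt').read()
data = json.loads(data)
tweets = data['statuses']
def parse_created(timestamp):
    # created_at looks like 'Wed Aug 27 13:08:45 +0000 2008';
    # drop the weekday and the UTC offset, keep month, day, time and year
    _, m, d, t, _, y = timestamp.split(' ')
    return datetime.strptime('%s %s %s %s' % (m, d, t, y), '%b %d %H:%M:%S %Y')
# One (user name, text, creation time) tuple per tweet
tweets_data = [(x['user']['name'], x['text'], parse_created(x['created_at']))
               for x in tweets]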
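tweets_data now contains a "table" with these columns (user name, text, creation time), in case you need the data in that format (e.g. for plotting). Alternatively, you can filter the tweets: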
erik_tweets = [x for x in tweets
               if x['user']['name'] == 'Erik Allik']
Or:
erik_tweets_before_today = [
    x for x in tweets
    if x['user']['name'] == 'Erik Allik'
    # created_at is a string, so parse it before comparing dates
    and parse_created(x['created_at']).date() < datetime.today().date()
]
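To get at the original question (counting tweets per neighborhood and pulling out the time of day), a minimal sketch might look like the following. It assumes Python 3 and an English/C locale for the month and weekday names. The names `parse_created_at`, `summarize`, `load_statuses` and the sample data are illustrative, not part of any library, and it uses strptime with `%z` rather than the split-based parser above.

```python
import json
from collections import Counter
from datetime import datetime

def parse_created_at(timestamp):
    # Twitter v1.1 style, e.g. 'Wed Aug 27 13:08:45 +0000 2008'
    return datetime.strptime(timestamp, '%a %b %d %H:%M:%S %z %Y')

def summarize(statuses):
    """Return (tweet count, Counter of tweets per hour of day)."""
    times = [parse_created_at(s['created_at']) for s in statuses]
    return len(times), Counter(t.hour for t in times)

def load_statuses(path):
    # Hypothetical helper: reads a file written by json.dump(result, outfile)
    with open(path) as infile:
        return json.load(infile)['statuses']

# Made-up sample standing in for load_statuses('data_A.txt')
sample = [
    {'created_at': 'Wed Aug 27 13:08:45 +0000 2008'},
    {'created_at': 'Wed Aug 27 13:40:00 +0000 2008'},
    {'created_at': 'Thu Aug 28 07:00:01 +0000 2008'},
]

total, per_hour = summarize(sample)
print(total, sorted(per_hour.items()))
```

Calling summarize once per file (data_A.txt and data_B.txt) would give one count and one per-hour histogram for each neighborhood, which can then be plotted side by side.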