Question

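I am a Python newbie, and I am writing a piece of code that collects tweets from two neighborhoods, saves them in JSON format, and plots some of the data. Do you have any suggestions on how to extract the date and time of each tweet, or how to count the tweets for a neighborhood? In other words, what is the best way to isolate the different variables in a .txt file containing JSON data and turn them into something plottable?

Thanks a lot!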

from twitter import *

import sys
import os.path
import simplejson as json
import tweepy
import csv

#log into Twitter
OAUTH_TOKEN = 'XXX'
OAUTH_SECRET = 'XXX'
CONSUMER_KEY = 'XXX'
CONSUMER_SECRET = 'XXX'

t = Twitter(auth=OAuth(OAUTH_TOKEN, OAUTH_SECRET, CONSUMER_KEY, CONSUMER_SECRET))
auth = tweepy.OAuthHandler(CONSUMER_KEY, CONSUMER_SECRET)

# Neighborhood A is treated as a circle with a 2 km (1.24 mi) radius.
# The search endpoint expects the query text in the parameter named q.
result_A = t.search.tweets(q="kaffe", geocode="55.662610,12.604074,1.24mi", result_type='recent')
with open('data_A.txt', 'w') as outfile:
    json.dump(result_A, outfile)

# Similarly, neighborhood B is a circle with a 2 km radius.
result_B = t.search.tweets(q="kaffe", geocode="55.694700,12.548283,1.24mi", result_type='recent')
with open('data_B.txt', 'w') as outfile:
    json.dump(result_B, outfile)

Answer 1

If you need more advanced spreadsheet functionality in Python, you may want to look at http://manns.github.io/pyspread/.

Also, SQLite3 is built in as the sqlite3 module: just create an in-memory database and SQL away.
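As a rough sketch of the in-memory SQLite idea, the snippet below loads a few rows into a table and counts them per user and per day. The rows in sample_rows are made up for illustration and only mimic the (name, text, created) shape; the table and column names are likewise assumptions, not anything prescribed by Twitter.

```python
import sqlite3
from datetime import datetime

# Hypothetical sample rows (user name, text, creation time); not real tweets.
sample_rows = [
    ("Alice", "kaffe tid", datetime(2014, 3, 1, 9, 15, 0)),
    ("Bob", "mere kaffe", datetime(2014, 3, 1, 9, 45, 0)),
    ("Alice", "kaffe igen", datetime(2014, 3, 2, 14, 5, 0)),
]

# ":memory:" creates a throwaway database that lives only in RAM.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tweets (user TEXT, text TEXT, created TEXT)")

# Store timestamps as "YYYY-MM-DD HH:MM:SS" text, which SQLite's date() understands.
conn.executemany(
    "INSERT INTO tweets VALUES (?, ?, ?)",
    [(user, text, created.isoformat(" ")) for user, text, created in sample_rows],
)

# Number of tweets per user.
per_user = conn.execute(
    "SELECT user, COUNT(*) FROM tweets GROUP BY user ORDER BY user"
).fetchall()

# Number of tweets per calendar day.
per_day = conn.execute(
    "SELECT date(created), COUNT(*) FROM tweets GROUP BY date(created) ORDER BY 1"
).fetchall()

print(per_user)
print(per_day)
```

The two result lists are plain (label, count) pairs, so they can be handed to whatever plotting tool you prefer.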

However, here are some examples of how to do the basic operations in pure Python:

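import json
from datetime import datetime
from urllib.request import urlopen  # Python 3; on Python 2 use urllib.urlopen

# Download the saved search result and decode the JSON document.
data = urlopen('https://dl.dropboxusercontent.com/u/2684973/data_A.txt').read()
data = json.loads(data)

# The individual tweets live under the 'statuses' key.
tweets = data['statuses']

def parse_created(timestamp):
    # created_at looks like 'Mon Sep 24 03:35:21 +0000 2012';
    # drop the weekday and the UTC offset, then parse the rest.
    _, m, d, t, _, y = timestamp.split(' ')
    return datetime.strptime('%s %s %s %s' % (m, d, t, y), '%b %d %H:%M:%S %Y')

# One (user name, text, creation time) tuple per tweet.
tweets_data = [(x['user']['name'], x['text'], parse_created(x['created_at']))
               for x in tweets]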

tweets_data now contains a "table" with these columns, in case you need the data in that format (e.g. for plotting); or:

erik_tweets = [x for x in tweets
               if x['user']['name'] == 'Erik Allik']

or:

# created_at is a string, so it has to be parsed before dates can be compared.
erik_tweets_before_today = [
    x for x in tweets
    if x['user']['name'] == 'Erik Allik'
    and parse_created(x['created_at']).date() < datetime.today().date()
]
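To tie this back to the original question (counting tweets per neighborhood and getting something plottable), here is a minimal sketch. The sample dictionary is invented for illustration and only mimics the 'statuses' / 'created_at' layout of a saved search result; in practice you would load data_A.txt and data_B.txt with json.load instead.

```python
from collections import Counter
from datetime import datetime

def parse_created(timestamp):
    # Same approach as above: drop the weekday and UTC offset, parse the rest.
    _, m, d, t, _, y = timestamp.split(' ')
    return datetime.strptime('%s %s %s %s' % (m, d, t, y), '%b %d %H:%M:%S %Y')

# Hypothetical stand-in for the two saved search results (not real tweets).
sample = {
    "A": {"statuses": [
        {"created_at": "Sat Mar 01 09:15:00 +0000 2014"},
        {"created_at": "Sat Mar 01 09:45:00 +0000 2014"},
    ]},
    "B": {"statuses": [
        {"created_at": "Sun Mar 02 14:05:00 +0000 2014"},
    ]},
}

# Number of tweets collected for each neighborhood.
tweet_counts = {name: len(result["statuses"]) for name, result in sample.items()}

# Number of tweets per hour of day, across both neighborhoods.
per_hour = Counter(
    parse_created(status["created_at"]).hour
    for result in sample.values()
    for status in result["statuses"]
)

# Two parallel lists, ready to be used as x and y values in a plot.
hours = sorted(per_hour)
counts = [per_hour[h] for h in hours]

print(tweet_counts)
print(hours, counts)
```

The hours and counts lists can then be passed to a plotting library; with matplotlib, for instance, a bar chart of counts against hours would be one natural choice.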

How to save and process tweets saved as JSON in Python
