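output file of the code. I am getting null for time_zone and UTC_offset in the output text file. I need a non-null, distinct value for each tweet: if a tweet is posted by someone in India, its UTC_offset and time_zone should differ from those of a tweet posted in the USA, and I need those non-null values. Also, if I convert with dataframe.to_json, tweet.created_at comes out in the wrong format, but this does not happen if I convert with dataframe.to_csv. Can someone explain? P.S. I am a beginner in Python and tweepy.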
import tweepy
import pandas as pd
from datetime import datetime, date, time, timedelta
import json
from dateutil.tz import tzoffset
# Variables that contain the user credentials to access the Twitter API
consumer_key = 'mine'
consumer_secret = 'mine'
access_token = 'mine'
access_token_secret = 'mine'
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)
api = tweepy.API(auth)
# Open/Create a file to append data
#csvFile = open('tweet.txt', 'a')
#Use csv Writer
#csvWriter = csv.writer(csvFile,delimiter=',')
results=[]
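# The Python expression "Depression" or "Anxiety" evaluates to just "Depression";
# to search for either term, put the OR operator inside the query string.
for tweet in tweepy.Cursor(api.search, q="Depression OR Anxiety", lang="en").items(30):
    # Skip retweets
    if (not tweet.retweeted) and ('RT @' not in tweet.text):
        results.append(tweet)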
def tweets_df(results):
    # Build one DataFrame row per tweet
    id_list = [tweet.id for tweet in results]
    data_set = pd.DataFrame(id_list, columns=["id"])
    data_set["text"] = [tweet.text for tweet in results]
    data_set["source"] = [tweet.source for tweet in results]
    data_set["screen_name"] = [tweet.user.screen_name for tweet in results]
    #localtime_tz = tzoffset(user.time_zone, user.utc_offset)
    # Note: tweet.user.created_at is the account creation time;
    # tweet.created_at is the time the tweet itself was posted.
    data_set["created_at"] = [tweet.user.created_at for tweet in results]
    # data_set["place"]=[tweet.place for tweet in results]
    data_set["location"] = [tweet.user.location for tweet in results]
    data_set["UTC_Offset"] = [tweet.user.utc_offset for tweet in results]
    data_set["timezone"] = [tweet.user.time_zone for tweet in results]
    # data_set["year"] = [tweet.created_at.year for tweet in results]
    # data_set["month"] = [tweet.created_at.month for tweet in results]
    #data_set["day"] = [tweet.created_at.day for tweet in results]
    #data_set["hour"] = [tweet.created_at.hour for tweet in results]
    return data_set
data_set = tweets_df(results)
#data_set.to_csv("/home/rajneeshkaushal/Documents/Pycharm/hived/tweet_data.txt",header=None)
out = data_set.to_json(orient='records')[1:-1].replace('},{', '} {')
with open('test.txt', 'w') as f:
    f.write(out)
Answer 0 (score: 1)
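I guess this link partially answers your question: https://twittercommunity.com/t/upcoming-changes-to-the-developer-platform/104603
Because of the new EU privacy law, the time zone values of the Twitter user object become private fields after May 23, so the null values for utc_offset and time_zone are most likely coming from the API itself rather than from your code.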
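On the created_at part of the question, which the link does not cover: as far as I know, DataFrame.to_json serializes datetime columns as epoch milliseconds by default, whereas to_csv writes them as readable strings, and that would explain the difference you see. Below is a minimal sketch, assuming that is the cause. The sample rows are made up and stand in for your tweepy results. Passing date_format="iso" should give readable timestamps, and lines=True writes one JSON object per line, which may let you drop the string slicing and replace step.

```python
import json
from datetime import datetime

import pandas as pd

# Hypothetical sample rows standing in for the tweepy results;
# the time zone fields are None, as the API now returns them.
data_set = pd.DataFrame(
    {
        "id": [1, 2],
        "created_at": [
            datetime(2018, 5, 24, 10, 30, 0),
            datetime(2018, 5, 25, 8, 15, 0),
        ],
        "UTC_Offset": [None, None],
        "timezone": [None, None],
    }
)

# Epoch format: datetimes are written as integers (milliseconds since 1970)
epoch_out = data_set.to_json(orient="records", date_format="epoch")

# ISO format: readable timestamps, one JSON object per line
iso_out = data_set.to_json(orient="records", date_format="iso", lines=True)

first_epoch = json.loads(epoch_out)[0]
first_iso = json.loads(iso_out.splitlines()[0])

print(first_epoch["created_at"])  # an integer timestamp
print(first_iso["created_at"])    # an ISO 8601 string
print(first_iso["timezone"])      # still None: the missing values stay null
```

Note that this only changes how the dates are written. It cannot bring back utc_offset or time_zone, since those are no longer exposed by the API.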