Getting null values for time_zone and utc_offset

Time: 2018-06-13 06:09:22

Tags: python tweepy

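In my output text file, the values of time_zone and UTC_offset come out as null. I need a non-null, distinct value for each tweet: if a tweet was posted by someone in India, its UTC_offset and time_zone should differ from those of a tweet posted in the United States, and I need those non-null values. Also, if I convert with dataframe.to_json, tweet.created_at comes out in the wrong format, which is not the case when I convert with dataframe.to_csv. Can someone explain? P.S. I am a beginner with Python and tweepy.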

import tweepy
import pandas as pd
from datetime import datetime, date, time, timedelta
import json
from dateutil.tz import tzoffset

# Variables that contain the user credentials to access the Twitter API
consumer_key = 'mine'
consumer_secret = 'mine'
access_token = 'mine'
access_token_secret = 'mine'

auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)


api = tweepy.API(auth)
# Open/create a file to append data
# csvFile = open('tweet.txt', 'a')
# Use csv writer
# csvWriter = csv.writer(csvFile, delimiter=',')
results = []

# In Python, "Depression" or "Anxiety" evaluates to just "Depression";
# the OR operator has to go inside the query string itself.
for tweet in tweepy.Cursor(api.search, q="Depression OR Anxiety", lang="en").items(30):
    # Skip retweets
    if (not tweet.retweeted) and ('RT @' not in tweet.text):
        results.append(tweet)


def tweets_df(results):
    id_list = [tweet.id for tweet in results]
    data_set = pd.DataFrame(id_list, columns=["id"])
    data_set["text"] = [tweet.text for tweet in results]
    data_set["source"] = [tweet.source for tweet in results]
    data_set["screen_name"] = [tweet.user.screen_name for tweet in results]

    # localtime_tz = tzoffset(user.time_zone, user.utc_offset)
    # Note: tweet.user.created_at is the account creation time;
    # tweet.created_at is the timestamp of the tweet itself.
    data_set["created_at"] = [tweet.user.created_at for tweet in results]
    # data_set["place"] = [tweet.place for tweet in results]
    data_set["location"] = [tweet.user.location for tweet in results]
    data_set["UTC_Offset"] = [tweet.user.utc_offset for tweet in results]
    data_set["timezone"] = [tweet.user.time_zone for tweet in results]
    # data_set["year"] = [tweet.created_at.year for tweet in results]
    # data_set["month"] = [tweet.created_at.month for tweet in results]
    # data_set["day"] = [tweet.created_at.day for tweet in results]
    # data_set["hour"] = [tweet.created_at.hour for tweet in results]
    return data_set


data_set = tweets_df(results)
# data_set.to_csv("/home/rajneeshkaushal/Documents/Pycharm/hived/tweet_data.txt", header=None)
# Strip the surrounding [ ] and separate the records with spaces instead of commas
out = data_set.to_json(orient='records')[1:-1].replace('},{', '} {')
with open('test.txt', 'w') as f:
    f.write(out)

1 answer:

Answer 0 (score: 1)

I think this link partly answers your question: https://twittercommunity.com/t/upcoming-changes-to-the-developer-platform/104603

Under the new EU privacy law, the time zone value of the Twitter user object becomes a private field after May 23.
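
As for the created_at format, a likely cause is that pandas' to_json writes datetimes as epoch milliseconds by default, whereas to_csv writes them as readable strings. Below is a minimal sketch, not a definitive fix: it assumes the user fields simply arrive as None, and it uses a hypothetical stand-in object (`fake`) in place of a real tweepy Status so it runs without API credentials. The helper name `tweets_to_records` is also made up for illustration.

```python
import json
from datetime import datetime
from types import SimpleNamespace

import pandas as pd


def tweets_to_records(tweets):
    """Serialize tweet-like objects to JSON with ISO 8601 timestamps."""
    rows = []
    for tweet in tweets:
        user = tweet.user
        rows.append({
            "id": tweet.id,
            "created_at": tweet.created_at,
            # Assumption: these are None when the API no longer exposes them.
            "utc_offset": getattr(user, "utc_offset", None),
            "time_zone": getattr(user, "time_zone", None),
        })
    frame = pd.DataFrame(rows)
    # date_format="iso" writes ISO 8601 strings instead of epoch milliseconds.
    return frame.to_json(orient="records", date_format="iso")


# Hypothetical stand-in for a tweepy Status object (not real API output).
fake = SimpleNamespace(
    id=1,
    created_at=datetime(2018, 6, 13, 6, 9, 22),
    user=SimpleNamespace(utc_offset=None, time_zone=None),
)

records = json.loads(tweets_to_records([fake]))
print(records[0])
```

With this approach the missing time zone fields still come out as null in the JSON, since there is no value to recover on the client side; only the timestamp formatting changes.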