I want to analyze Twitter data (a JSON file) with Python. Here is the script:
import json

fname = 'analisis.json'
with open(fname, 'r') as f:

    users_with_geodata = {
        "data": []
    }
    all_users = []
    total_tweets = 0
    geo_tweets = 0
    for line in f:
        tweet = json.loads(line)
        if tweet['user']['id']:
            total_tweets += 1
            user_id = tweet['user']['id']
            if user_id not in all_users:
                all_users.append(user_id)

                #Give users some data to find them by. User_id listed separately
                # to make iterating this data later easier
                user_data = {
                    "user_id" : tweet['user']['id'],
                    "features" : {
                        "name" : tweet['user']['name'],
                        "id": tweet['user']['id'],
                        "screen_name": tweet['user']['screen_name'],
                        "tweets" : 1,
                        "location": tweet['user']['location'],
                    }
                }

                #Iterate through different types of geodata to get the variable primary_geo
                if tweet['coordinates']:
                    user_data["features"]["primary_geo"] = str(tweet['coordinates'][tweet['coordinates'].keys()[1]][1]) + ", " + str(tweet['coordinates'][tweet['coordinates'].keys()[1]][0])
                    user_data["features"]["geo_type"] = "Tweet coordinates"
                elif tweet['place']:
                    user_data["features"]["primary_geo"] = tweet['place']['full_name'] + ", " + tweet['place']['country']
                    user_data["features"]["geo_type"] = "Tweet place"
                else:
                    user_data["features"]["primary_geo"] = tweet['user']['location']
                    user_data["features"]["geo_type"] = "User location"

                #Add only tweets with some geo data to .json. Comment this if you want to include all tweets.
                if user_data["features"]["primary_geo"]:
                    users_with_geodata['data'].append(user_data)
                    geo_tweets += 1

            #If user already listed, increase their tweet count
            elif user_id in all_users:
                for user in users_with_geodata["data"]:
                    if user_id == user["user_id"]:
                        user["features"]["tweets"] += 1

    #Count the total amount of tweets for those users that had geodata
    for user in users_with_geodata["data"]:
        geo_tweets = geo_tweets + user["features"]["tweets"]

    #Get some aggregated numbers on the data
    print ("The file included ") + str(len(all_users)) + (" unique users who tweeted with or without geo data")
    print ("The file included ") + str(len(users_with_geodata['data'])) + (" unique users who tweeted with geo data, including 'location'")
    print ("The users with geo data tweeted ") + str(geo_tweets) + (" out of the total ") + str(total_tweets) + (" of tweets.")

with open('analisis_geo.json', 'w') as fout:
    fout.write(json.dumps(users_with_geodata, indent=4))

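When I run it under Python 3.6.1, I get an error message (not shown here). Under Python 2.7.13, however, the script runs fine (output not shown here). Does anyone know how to make the script compatible with Python 3.6.1?
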
Answer 0 (score: 0)

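In Python 3, dict.keys() returns a different type of object than in Python 2: a view rather than a list, and a view cannot be indexed. To access a key by index, you need to convert it to a list first. So:
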
first_key = list(my_dict.keys())[0]
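
As a minimal sketch of the difference (the dictionary below is made-up sample data, not taken from your file, and the variable names are only illustrative), indexing the view fails under Python 3 while indexing the converted list works:

```python
# Hypothetical sample data, used only to illustrate the behaviour.
my_dict = {"type": "Point", "coordinates": [-73.99, 40.73]}

# In Python 3, keys() returns a view object that does not support indexing.
try:
    my_dict.keys()[1]
    indexable = True
except TypeError:
    indexable = False

# Converting the view to a list restores indexing.
key_list = list(my_dict.keys())
second_key = key_list[1]

print(indexable, second_key)
```

Keep in mind that the position of a key depends on the dictionary's ordering, so indexing keys by position is fragile in general.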

You can also iterate over the keys in a for loop, as below, although that does not help much in your case.

for key in my_dict.keys():
    pass  # do things with key
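
Applied to your script, one possible fix is sketched below. It assumes that tweet['coordinates'] is a GeoJSON-style point whose "coordinates" entry holds [longitude, latitude], which is what the indexing in your original line suggests. format_primary_geo, sample and unique_users are hypothetical names introduced only for this illustration. Looking the entry up by name avoids depending on key order altogether. Note also that print is a function in Python 3 and returns None, so the whole message has to be built inside the parentheses; print("...") + str(...) raises a TypeError.

```python
def format_primary_geo(coordinates):
    """Return "lat, lon" from a GeoJSON-style point (assumed layout)."""
    point = coordinates["coordinates"]  # assumed to be [longitude, latitude]
    return str(point[1]) + ", " + str(point[0])


# Hypothetical sample standing in for tweet['coordinates'].
sample = {"type": "Point", "coordinates": [-73.99, 40.73]}
primary_geo = format_primary_geo(sample)

# Build the full string first, then pass it to print() as one argument.
unique_users = 3  # placeholder for len(all_users)
message = ("The file included " + str(unique_users)
           + " unique users who tweeted with or without geo data")
print(message)
```

In the script itself, the long primary_geo assignment would then become a call such as format_primary_geo(tweet['coordinates']), and each of the three print lines would wrap its whole concatenation in a single pair of parentheses.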