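I want to take a sample of one Twitter account's friends (meaning the accounts that a Twitter user is following) and find all of their friends, to see which other friends they have in common. The problem is that I don't know how to handle protected accounts, and I keep running into this error: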
tweepy.error.TweepError: Not authorized.
Here is my code:
...
screen_name = ----
file_name = "followers_data/follower_ids-" + screen_name + ".txt"
with open(file_name) as file:
    ids = file.readlines()
num_samples = 30
ids = [x.strip() for x in ids]
friends = [[] for i in range(num_samples)]
for i in range(0, num_samples):
    id = random.choice(ids)
    for friend in tweepy.Cursor(api.friends_ids, id).items():
        print(friend)
        friends[i].append(friend)
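I have a list of all the friends of one account, screen_name, and I load the friend IDs from it. I then want to sample some of them and look at their friends.
I also tried something like this: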
def limit_handled(cursor, name):
    try:
        yield cursor.next()
    except tweepy.TweepError:
        print("Something went wrong... ", name)
        pass

for i in range(0, num_samples):
    id = random.choice(ids)
    items = tweepy.Cursor(api.friends_ids, id).items()
    for friend in limit_handled(items, id):
        print(friend)
        friends[i].append(friend)
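However, it seems that only one friend is stored for each sampled friend before the loop moves on to the next sample. I'm new to both Python and Tweepy, so please let me know if anything looks odd.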
Answer 0 (score: 0)
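First, a couple of comments on naming. The names file and id are not reserved words, but id is a built-in function (and file is a built-in type in Python 2), so using them as variable names shadows the built-ins. It is best to avoid that; I have renamed them.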
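To illustrate why reusing a built-in name can bite, here is a small sketch in plain Python (nothing Tweepy-specific; the function name shadow_demo and the sample value are made up for illustration). Once id is rebound to a string inside a scope, the built-in function is no longer reachable under that name in that scope:

```python
import builtins


def shadow_demo():
    # Rebinding `id` locally hides the built-in id() inside this function.
    id = "12345"  # made-up user ID, for illustration only
    try:
        id(object())  # this now tries to call a str
    except TypeError as exc:
        return "TypeError: {}".format(exc)


message = shadow_demo()
print(message)

# The built-in itself is untouched: at module level `id` still refers to it.
print(builtins.id is id)
```

Nothing breaks until the built-in is actually needed, which is why this kind of bug tends to show up far from the line that caused it.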
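Second, when you initialise the tweepy API, passing wait_on_rate_limit=True is enough to handle rate limits, and passing wait_on_rate_limit_notify=True makes it print a notice whenever it is waiting because of a rate limit.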
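This also seems to explain why the limit_handled attempt in the question stored only one friend per sample: a generator function with a single yield outside any loop produces one value and then finishes. A minimal sketch, using a plain iterator as a stand-in for the Tweepy cursor (the names one_shot and looping and the fake IDs are illustrative assumptions, not Tweepy API):

```python
def one_shot(cursor):
    # Mirrors the shape of the question's helper: one yield, no loop.
    try:
        yield next(cursor)
    except StopIteration:
        return


def looping(cursor):
    # Keeps yielding until the underlying iterator is exhausted.
    while True:
        try:
            yield next(cursor)
        except StopIteration:
            return


fake_friend_ids = [101, 102, 103]  # made-up IDs standing in for API results

first_only = list(one_shot(iter(fake_friend_ids)))
everything = list(looping(iter(fake_friend_ids)))
print(first_only)
print(everything)
```

With a real cursor, the Tweepy error would presumably be caught alongside StopIteration, and the helper should return rather than retry so that a protected account does not cause an endless loop.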
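You also lose some information when you set friends = [[] for i in range(num_samples)], because you can no longer tell which account each list of friends belongs to. A dictionary that maps each ID you use to the friends found for it handles this better.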
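The difference between the two containers can be sketched without touching the API. The IDs and friend lists below are made up, and fake_friends_of is a hypothetical stand-in for what the API would return:

```python
import random

# Made-up IDs and friend lists, standing in for real API results.
fake_friends_of = {
    "111": [1, 2, 3],
    "222": [2, 3, 4],
    "333": [3, 5],
}

# random.sample picks without replacement, so no ID is chosen twice.
sampled_ids = random.sample(sorted(fake_friends_of), 2)

# List of lists: the friend lists survive, but not whose friends they are.
as_lists = [fake_friends_of[uid] for uid in sampled_ids]

# Dictionary: each sampled ID stays attached to its own friend list.
as_dict = {uid: fake_friends_of[uid] for uid in sampled_ids}

for uid, friend_ids in as_dict.items():
    print(uid, friend_ids)
```

Both containers hold the same friend lists; only the dictionary lets you look a list up by the account it came from.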
My corrected code is as follows:
import tweepy
import random

consumer_key = '...'
consumer_secret = '...'
access_token = '...'
access_token_secret = '...'

# OAuth process, using the keys and tokens
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)

# Creation of the actual interface, using authentication. Use rate limits.
api = tweepy.API(auth, wait_on_rate_limit=True, wait_on_rate_limit_notify=True)

screen_name = '----'
file_name = "followers_data/follower_ids-" + screen_name + ".txt"
with open(file_name) as f:
    ids = [x.strip() for x in f.readlines()]

num_samples = 30
friends = dict()

# Initialise i
i = 0

# We want to check that i is less than our number of samples, but we also need to make
# sure there are IDs left to choose from.
while i < num_samples and ids:
    current_id = random.choice(ids)
    # remove the ID we're testing from the list, so we don't pick it again.
    ids.remove(current_id)
    try:
        # try to get friends, and add them to our dictionary value if we can
        # use .get() to cope with the first page, when the key does not exist yet.
        for page in tweepy.Cursor(api.friends_ids, current_id).pages():
            friends[current_id] = friends.get(current_id, []) + page
        i += 1
    except tweepy.TweepError:
        # we get a tweep error when we can't view a user - skip them and move onto the next.
        # don't increment i as we want to replace this user with someone else.
        print('Could not view user {}, skipping...'.format(current_id))
The output is a dictionary, friends, which maps each user ID to the list of that user's friends.
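From a dictionary of that shape, the friends that the sampled accounts have in common can be counted with collections.Counter. This is only a sketch: the data below is made up, and the threshold of two accounts is an arbitrary assumption rather than anything from the question.

```python
from collections import Counter

# Made-up stand-in for the `friends` dictionary (user ID -> list of friend IDs).
friends = {
    "111": [1, 2, 3],
    "222": [2, 3, 4],
    "333": [3, 5],
}

# Count in how many sampled accounts' friend lists each ID appears.
# set() guards against an ID being counted twice for one account.
counts = Counter(
    fid for friend_ids in friends.values() for fid in set(friend_ids)
)

# IDs followed by at least two of the sampled accounts, most common first.
shared = [(fid, n) for fid, n in counts.most_common() if n >= 2]
print(shared)
```

Raising the threshold narrows the result to friends shared by more of the sampled accounts.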