Unable to open a Twitter text file in Python

Time: 2016-12-16 18:57:20

Tags: python json python-3.x twitter tweepy

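I collected a batch of tweets to analyze with Python, but when I try to open the text file I get the error message below. I don't know whether something is wrong with the schema of the tweets I collected.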

JSONDecodeError: Extra data: line 2 column 1 (char 12025)

Here is the code I wrote to read the file:

import json

with open('tweets1.json') as dakota_file:
    dakota_j = json.loads(dakota_file.read())  # raises the JSONDecodeError shown above
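For context, "Extra data" is the message `json.loads` reports when the input holds more than one top-level JSON value. The sketch below reproduces that with a made-up two-line string; the sample data is an assumption for illustration, not content from the actual tweet file.

```python
import json

# Hypothetical sample: two JSON documents separated by a newline,
# similar in shape to a file that stores one tweet per line.
two_documents = '{"id": 1}\n{"id": 2}'

try:
    json.loads(two_documents)
    error_message = None
except json.JSONDecodeError as exc:
    # The parser finishes the first object, then finds more text after it.
    error_message = str(exc)

print(error_message)
```

The reported position points at the start of the second document, which is consistent with the "line 2 column 1" in the error above.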

See the code used to collect the tweets:

import sys
import os
import jsonpickle
import tweepy

# `api` is assumed to be an authenticated tweepy.API instance created elsewhere.

searchQuery = '#Dakota-Access-Pipeline'  # this is what we're searching for
#maxTweets = 10000000 # Some arbitrary large number
maxTweets=6000
tweetsPerQry = 100  # this is the max the API permits
#fName = 'tweets.txt' # We'll store the tweets in a text file.
fName='tweets.json'

# If results from a specific ID onwards are reqd, set since_id to that ID.
# else default to no lower limit, go as far back as API allows
sinceId = None

# If only results below a specific ID are reqd, set max_id to that ID.
# else default to no upper limit, start from the most recent tweet matching the search query.
max_id = -10000000

tweetCount = 0
print("Downloading max {0} tweets".format(maxTweets))
with open(fName, 'w') as f:
    while tweetCount < maxTweets:
        try:
            if (max_id <= 0):
                if (not sinceId):
                    new_tweets = api.search(q=searchQuery, count=tweetsPerQry)
                else:
                    new_tweets = api.search(q=searchQuery, count=tweetsPerQry,
                                            since_id=sinceId)
            else:
                if (not sinceId):
                    new_tweets = api.search(q=searchQuery, count=tweetsPerQry,
                                            max_id=str(max_id - 1))
                else:
                    new_tweets = api.search(q=searchQuery, count=tweetsPerQry,
                                            max_id=str(max_id - 1),
                                            since_id=sinceId)
            if not new_tweets:
                print("No more tweets found")
                break
            for tweet in new_tweets:
                f.write(jsonpickle.encode(tweet._json, unpicklable=False) +
                        '\n')
            tweetCount += len(new_tweets)
            print("Downloaded {0} tweets".format(tweetCount))
            max_id = new_tweets[-1].id
        except tweepy.TweepError as e:
            # Just exit if any error
            print("some error : " + str(e))
            break

print("Downloaded {0} tweets, Saved to {1}".format(tweetCount, fName))
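Because the loop above writes each tweet followed by `'\n'`, the output file holds one JSON object per line rather than a single JSON document, which would explain the "Extra data" error when the whole file is passed to `json.loads`. A minimal sketch of reading such a file line by line follows; the helper name `load_tweets` and the `utf-8` encoding are assumptions made for illustration, not part of the original script.

```python
import json


def load_tweets(path):
    """Read a file holding one JSON object per line and return a list of dicts."""
    tweets = []
    # The encoding is an assumption; adjust it if the file was written differently.
    with open(path, encoding="utf-8") as tweet_file:
        for line in tweet_file:
            line = line.strip()
            if not line:  # skip blank lines, such as a trailing empty line
                continue
            tweets.append(json.loads(line))
    return tweets
```

A call such as `dakota_j = load_tweets('tweets.json')` would then return a list with one dictionary per tweet. Note that the collection script writes to `tweets.json` while the reading code opens `tweets1.json`, so the path passed in should match the file that was actually produced.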

0 answers:

No answers