tweet_mode='extended' and text.append(tweet.full_text) not working: tweets still end in '...' and filtering fails

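Time: 2019-07-28 18:33:13

Tags: python tweepy

I can't retrieve full tweets. Most tweets end with "...". Because the API is limited, my attempts to filter the results have failed. I'm looking for a different solution: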

    import tweepy #https://github.com/tweepy/tweepy
    import csv
    import pandas as pd
    # Used for progress bar
    import time
    import sys

    #Twitter API credentials
    consumer_key = ""
    consumer_secret = ""
    access_key = ""
    access_secret = ""

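    OAUTH_KEYS = {'consumer_key': consumer_key, 'consumer_secret': consumer_secret,
                  'access_token_key': access_key, 'access_token_secret': access_secret}
    auth = tweepy.OAuthHandler(OAUTH_KEYS['consumer_key'], OAUTH_KEYS['consumer_secret'])
    # Attach the access token; it was collected above but never passed to the handler
    auth.set_access_token(OAUTH_KEYS['access_token_key'], OAUTH_KEYS['access_token_secret'])
    api = tweepy.API(auth, wait_on_rate_limit=True, wait_on_rate_limit_notify=True)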


    search = tweepy.Cursor(api.search, q='#tips -filter:media', 
                   tweet_mode='extended', 
                   include_rts = False,
                   lang="en").items(2000)

    # Create lists for each field desired from the tweets.
    sn = []
    text = []
    timestamp =[]
    for tweet in search:
        # print(tweet.user.screen_name, tweet.created_at, tweet.full_text)
        print(tweet.full_text)
        timestamp.append(tweet.created_at)
        sn.append(tweet.user.screen_name)
        text.append(tweet.full_text)
        print('-------------------------------------------------------------------------------')

    #Convert lists to dataframe
    df = pd.DataFrame()
    df['timestamp'] = timestamp
    df['sn'] = sn
    df['text'] = text

    # Prepare for date filtering. Add an EST time column, since the chat is hosted by people in that time zone.
    df['timestamp'] = pd.to_datetime(df['timestamp'])
    df['EST'] = df['timestamp'] - pd.Timedelta(hours=5) #Convert to EST

    df['EST'] = pd.to_datetime(df['EST'])
    #============================================================================
    # list of timestamp, EST, sn, text  
    col1 = df['timestamp']
    col2 = df['EST']
    col3 = df['sn']
    col4 = df['text']

    # dictionary of columns (named `columns` so it does not shadow the built-in dict)
    columns = {'TimeStamp': col1, 'EST': col2, 'SN': col3, 'Text': col4}
    data = pd.DataFrame(columns)

    # saving the dataframe 
    data.to_csv('tipstweets.csv') 

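The code prints tweets that end with "...", but I can't filter those tweets out. What else can I try? I couldn't find a solution on GitHub or in the Twitter documentation. What should I be searching for?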

1 answer:

Answer 0: (score: 0)

Not all tweet objects contain full_text. If full_text is not present, use text as the fallback.
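A minimal sketch of that fallback, assuming tweepy Status-like objects. The helper name `get_full_text` and the sample objects are hypothetical and only stand in for real API results. The sketch also covers one case the answer does not mention: in extended mode, a retweet's own full_text may still be cut off with an ellipsis, while the untruncated text sits on `retweeted_status`. That is a likely reason some tweets in the question still end in "...".

```python
from types import SimpleNamespace


def get_full_text(tweet):
    """Return the longest available text for a tweepy Status-like object."""
    # A retweet's own full_text may be truncated even in extended mode;
    # the original tweet is exposed as retweeted_status, so read from it.
    original = getattr(tweet, "retweeted_status", None)
    if original is not None:
        tweet = original
    # Prefer full_text and fall back to text when it is absent.
    full_text = getattr(tweet, "full_text", None)
    if full_text is not None:
        return full_text
    return getattr(tweet, "text", "")


# Hypothetical stand-ins for the three shapes a result can take.
extended = SimpleNamespace(full_text="a complete extended tweet")
compat = SimpleNamespace(text="a tweet that only has text")
retweet = SimpleNamespace(
    full_text="RT @someone: the start of a long twe…",
    retweeted_status=SimpleNamespace(full_text="the start of a long tweet, in full"),
)

texts = [get_full_text(t) for t in (extended, compat, retweet)]
print(texts)
```

In the question's loop this would replace the direct attribute access, for example `text.append(get_full_text(tweet))`. Note that reading from `retweeted_status` drops the "RT @user:" prefix. If retweets are not wanted at all, adding `-filter:retweets` to the search query is another option.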