Appending to a pandas DataFrame in a while loop

Time: 2018-12-09 04:58:29

Tags: python json pandas

I have a DataFrame named full_senator_df:

    Official Twitter    Senator         party
0   SenShelby           Richard Shelby  Republican
1   lisamurkowski       Lisa Murkowski  Republican
2   SenDanSullivan      Dan Sullivan    Republican

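I have written some code that uses this data to retrieve each senator's tweets. Is it possible to append the results to a table, or get them as JSON, instead of printing them as the code currently does?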

senator_count = 0
num_senators = len(full_senator_df.index)

while senator_count < num_senators:  # "<" rather than "<=": valid row labels run from 0 to num_senators - 1
    senator_official_twitter = full_senator_df['Official Twitter'][senator_count]
    tweets = api.user_timeline(screen_name = senator_official_twitter, count = tweet_num, include_rts = True)

    for status in tweets:
        print(full_senator_df['Senator'][senator_count], status.text, full_senator_df['party'][senator_count])

    senator_count += 1


2 answers:

Answer 0: (score: 0)

The code below creates a new dataframe (table) containing the tweets of each senator together with the senator's party.

import pandas as pd

# Create an empty dataframe stub to append to later
all_tweets_df = pd.DataFrame(columns=['Senator', 'Party', 'Tweet'])

# Iterate over the initial dataframe
for _, row in full_senator_df.iterrows():
    tweets = api.user_timeline(screen_name=row['Official Twitter'],
                               count=tweet_num,
                               include_rts=True)
    # Keep only the text of each status; the scalar Senator and Party
    # values are repeated for every tweet in the list
    senator_tweets_df = pd.DataFrame({'Senator': row['Senator'],
                                      'Party': row['party'],
                                      'Tweet': [status.text for status in tweets]})
    # Append to the output
    all_tweets_df = pd.concat([all_tweets_df, senator_tweets_df], sort=True)

The output should look something like this:

        Party    Senator   Tweet
0  Republican     Shelby  tweet1
1  Republican     Shelby  tweet2
2  Republican     Shelby  tweet3
0  Republican  Murkowski  tweet1
1  Republican  Murkowski  tweet2
2  Republican  Murkowski  tweet3
0  Republican   Sullivan  tweet1
1  Republican   Sullivan  tweet2
2  Republican   Sullivan  tweet3
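The question also asks about JSON, which the table above does not cover. A minimal sketch of one way to get it, assuming the combined frame has the three columns shown above; the sample rows here are made-up stand-ins, not real API results. Because repeated `pd.concat` calls leave duplicate index labels (0, 1, 2, 0, 1, 2, ...), the index is reset first.

```python
import json

import pandas as pd

# Stand-in for the combined frame built above (hypothetical sample values)
all_tweets_df = pd.DataFrame(
    {
        'Party': ['Republican', 'Republican'],
        'Senator': ['Richard Shelby', 'Lisa Murkowski'],
        'Tweet': ['tweet1', 'tweet1'],
    },
    index=[0, 0],  # mimic the repeated labels left behind by concat
)

# Renumber the rows so every row has a unique label
all_tweets_df = all_tweets_df.reset_index(drop=True)

# Serialize as a JSON array with one object per tweet
tweets_json = all_tweets_df.to_json(orient='records')

# Parse it back into plain Python dicts to inspect the structure
records = json.loads(tweets_json)
```

With `orient='records'` each tweet becomes its own JSON object, which is usually the easiest shape to pass on to other tools.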

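Answer 1: (score: 0)

I think you are almost there. If you want to keep your loop, you can write the data into a dataframe instead of printing it. First define a new dataframe: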

# Place these two lines before your while loop
dfTweets = pd.DataFrame(columns=["Senator", "tweet_text", "party"])
row_num = 0
while ...
...
    for status in tweets:
        # One row per tweet: .loc[row label, column name]
        dfTweets.loc[row_num, "Senator"] = full_senator_df['Senator'][senator_count]
        dfTweets.loc[row_num, "tweet_text"] = status.text
        dfTweets.loc[row_num, "party"] = full_senator_df['party'][senator_count]
        row_num += 1
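Filling a dataframe one cell at a time gets slow as the number of tweets grows. A common alternative is to collect plain dicts in a list and build the dataframe once at the end. The sketch below is only an illustration: `collect_tweets` and `fake_timeline` are hypothetical names, and `fake_timeline` stands in for `api.user_timeline` so the example runs without Twitter credentials.

```python
from types import SimpleNamespace

import pandas as pd


def collect_tweets(senators_df, fetch_timeline):
    """Return one row per tweet.

    fetch_timeline(screen_name) is a hypothetical callable that returns
    objects with a .text attribute, in place of api.user_timeline.
    """
    rows = []
    for _, senator in senators_df.iterrows():
        for status in fetch_timeline(senator['Official Twitter']):
            rows.append({'Senator': senator['Senator'],
                         'Party': senator['party'],
                         'Tweet': status.text})
    # Build the dataframe once, after the loops have finished
    return pd.DataFrame(rows, columns=['Senator', 'Party', 'Tweet'])


# Small stand-in for full_senator_df from the question
full_senator_df = pd.DataFrame({
    'Official Twitter': ['SenShelby', 'lisamurkowski'],
    'Senator': ['Richard Shelby', 'Lisa Murkowski'],
    'party': ['Republican', 'Republican'],
})


def fake_timeline(screen_name):
    # Two made-up statuses per account (not real tweets)
    return [SimpleNamespace(text=f'{screen_name} tweet {i}') for i in (1, 2)]


dfTweets = collect_tweets(full_senator_df, fake_timeline)
```

In real use, something like `lambda name: api.user_timeline(screen_name=name, count=tweet_num, include_rts=True)` could be passed as `fetch_timeline`, assuming `api` and `tweet_num` are defined as in the question.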