Beautiful Soup output is not very readable

Time: 2019-05-26 17:59:07

Tags: python html twitter python-3.7

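I built a Twitter scraper with the Beautiful Soup library. Given a username, it successfully retrieves that user's bio and top tweet. The only problem is that the output looks odd: the text is extracted from HTML that contains many blank lines, so the result is full of empty lines too.

I tried prettify, but it returned only an empty line. I also tried pprint.pprint.

I am new to Python and can't think of another way to make my script's output cleaner.

Any help would be appreciated.

Here is my script: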

import requests
from bs4 import BeautifulSoup
import pprint

q = "https://twitter.com"


def find_bio(username):
    c = format("https://twitter.com"+"/" + username)
    r = requests.get(c)
    s = BeautifulSoup(r.text, "html.parser")

    return s.find("div", class_="ProfileHeaderCard").text


def find_toptweet(username):
    c = format("https://twitter.com"+"/" + username)
    r = requests.get(c)
    s = BeautifulSoup(r.text, "html.parser")

    return s.find("div", class_="content").text


if __name__ == "__main__":
    username = input('enter username: ')
    bio = find_bio(username)
    tweet = find_toptweet(username)
    print("Bio--------------------------------------------------------------")
    pprint.pprint(bio)
    print("End of Bio-------------------------------------------------------")
    print('top tweet')
    pprint.pprint(tweet)

This is the output:

enter username: altifali4
Bio--------------------------------------------------------------------------------------
('\n'
 '\n'
 'Altif Ali\n'
 '\n'
 '\n'
 '\n'
 '@AltifAli4\n'
 '\n'
 '\n'
 'People, by and large, are good people\n'
 '\n'
 'UoH\n'
 '\n'
 '\n'
 '\n'
 '\n'
 '\n'
 '\n'
 '\n'
 ' \n'
 '    instagram.com/altif.ali\n'
 '  \n'
 '\n'
 '\n'
 '\n'
 '\n'
 'Joined August 2018\n'
 '\n'
 '\n'
 '\n'
 '    Born 1999\n'
 '\n'
 '\n'
 '\n')
End of Bio--------------------------------------------------------------------------------------
top tweet
('\n'
 '\n'
 '\n'
 '\n'
 '\n'
 'Lowkey\u200f\xa0@Lowkey0nline\n'
 '\n'
 'May 22\n'
 '\n'
 '\n'
 '\n'
 '\n'
 '\n'
 '\n'
 'More\n'
 '\n'
 '\n'
 '\n'
 '\n'
 '\n'
 '\n'
 '\n'
 '\n'
 '\n'
 'Copy link to Tweet\n'
 '\n'
 '\n'
 'Embed Tweet\n'
 '\n'
 '\n'
 '\n'
 '\n'
 '\n'
 '\n'
 '\n'
 'Power concedes nothing without demand. Without demand power concedes '
 'nothing.\n')

Process finished with exit code 0
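For context on why the output above shows quotes and literal \n sequences: pprint formats the repr of a string rather than the string itself, so it cannot remove blank lines. The short sketch below illustrates the difference; the sample text is made up for illustration and is not the real scraped value.

```python
import pprint

# Made-up sample resembling the scraped text (not real scraped output)
text = "\n\nAltif Ali\n\n@AltifAli4\n"

# pformat returns what pprint.pprint would print: the repr of the string,
# in which every newline shows up as the two characters backslash + n
formatted = pprint.pformat(text)
print(formatted)

# print writes the string itself, so the newlines become real line breaks
print(text)
```

Switching from pprint.pprint to print removes the quotes and escapes, but the blank lines still have to be filtered out separately.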

1 Answer:

Answer 0 (score: 1)

Try replacing the if block with the following:

if __name__ == "__main__":
    username = input('enter username: ')
    bio = find_bio(username).replace("\n","")
    tweet = find_toptweet(username).replace("\n","")
    print("Bio--------------------------------------------------------------")
    print(bio)
    print("End of Bio-------------------------------------------------------")
    print('top tweet')
    print(tweet)
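One caveat with replace("\n", ""): it removes every line break, so separate fields such as the name and the handle run together on one line. If you would rather keep one field per line, a possible alternative is to strip each line and drop the empty ones. The sketch below assumes a helper named clean_text, which is a hypothetical name and not part of the original script, and uses a made-up sample string.

```python
def clean_text(raw):
    """Strip each line and drop the lines that end up empty."""
    # clean_text is a hypothetical helper, not part of the original script
    lines = (line.strip() for line in raw.splitlines())
    return "\n".join(line for line in lines if line)


# Made-up sample shaped like the scraped bio text
sample = (
    "\n\nAltif Ali\n\n\n@AltifAli4\n\n"
    "    instagram.com/altif.ali\n  \n\nJoined August 2018\n"
)

cleaned = clean_text(sample)
print(cleaned)
```

In the script this could be used as print(clean_text(find_bio(username))) in place of the replace call.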

Hope this helps.
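Another option is to let Beautiful Soup do the cleanup: get_text accepts separator and strip arguments, which strip each text fragment and skip the empty ones. The sketch below runs on a small hand-written HTML snippet that only stands in for the profile card; the real Twitter markup is more complex and is an assumption here.

```python
from bs4 import BeautifulSoup

# Hand-written stand-in for the profile card (assumed markup, not Twitter's)
html = """
<div class="ProfileHeaderCard">
  <h1>
    <a>Altif Ali</a>
  </h1>
  <h2>
    <span>@AltifAli4</span>
  </h2>
  <p>People, by and large, are good people</p>
</div>
"""

soup = BeautifulSoup(html, "html.parser")
card = soup.find("div", class_="ProfileHeaderCard")

# strip=True trims every text fragment and drops the empty ones;
# separator="\n" places each remaining fragment on its own line
bio = card.get_text(separator="\n", strip=True)
print(bio)
```

Note that each text fragment lands on its own line, so text split across nested inline tags would also be split across lines; in that case joining with a space (separator=" ") may read better.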