How to remove characters from strings in a list in Python

Asked: 2016-03-06 20:27:03

Tags: python string list character strip

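I have been playing with this for a long time. I want to replace a string of text in the values returned by the each_div variable, which holds a bunch of parsed values from a web page.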

def scrape_page():
    # create_dir, project_dir, soup, and f are defined elsewhere (not shown in the question)
    create_dir(project_dir)
    page = 1
    max_page = 10
    while page < max_page:
        page = page + 1
        for each_div in soup.find_all('div', {'class': 'username'}):
            f.write(str(each_div) + "\n")

If I run this code, it parses the data in the username class from the HTML page. The problem is that it returns this:

<div class="username">someone_s_username</div>

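What I have been trying to do is strip out the <div class="username"></div> part so that it returns only the actual username and not the HTML. If anyone knows how to do this, that would be great. Thank you.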

1 answer:

Answer 0: (score: 1)

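Sure, you can use Python's str.replace method. Note that each_div is a BeautifulSoup Tag, not a string, so convert it with str() first:

for each_div in soup.find_all('div', {'class': 'username'}):
    text = str(each_div)  # Tag -> HTML string
    text = text.replace('<div class="username">', "")
    text = text.replace("</div>", "")
    f.write(text + "\n")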

Alternatively, you can split the string to get the part you need (again converting the Tag to a string first):

for each_div in soup.find_all('div', {'class': 'username'}):
    text = str(each_div)       # Tag -> HTML string
    text = text.split(">")[1]  # the piece between the first and second ">"
    text = text.split("<")[0]  # everything before the next "<"
    f.write(text + "\n")
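
For what it's worth, here is a small self-contained check of both string approaches on the sample line from the question, with no scraping involved. The helper names strip_with_replace and strip_with_split are made up for illustration, and both assume the markup looks exactly like the sample (same attribute, no nested tags):

```python
# Sample line copied from the question.
SAMPLE = '<div class="username">someone_s_username</div>'


def strip_with_replace(html):
    """Remove the exact opening and closing tags by literal replacement."""
    return html.replace('<div class="username">', "").replace("</div>", "")


def strip_with_split(html):
    """Take the text between the first '>' and the next '<'."""
    return html.split(">")[1].split("<")[0]


print(strip_with_replace(SAMPLE))
print(strip_with_split(SAMPLE))
```

Both are fragile: if the div ever gains another attribute, the literal replace no longer matches the opening tag, and the split version breaks as soon as the username is wrapped in a nested tag.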

Oh, now I remember: I believe you can simply do this:

for each_div in soup.find_all('div',{'class':'username'}):
    f.write(str(each_div.text) + "\n")
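
To see that last approach on its own, here is a minimal sketch that parses a small hard-coded snippet instead of a real page. It assumes the bs4 package is installed; the html string and the usernames list are invented for the example. It uses get_text(strip=True), which returns the same text as .text but with surrounding whitespace trimmed:

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the scraped page.
html = """
<div class="username">someone_s_username</div>
<div class="username"> another_user </div>
<div class="other">not a username</div>
"""

soup = BeautifulSoup(html, "html.parser")

# Collect only the text inside each matching div, without the surrounding tags.
usernames = [
    div.get_text(strip=True)
    for div in soup.find_all("div", {"class": "username"})
]

print(usernames)
```

This lets the parser deal with the markup, so it should keep working if the tag's attributes change, which the string-based versions do not.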