I am using the BeautifulSoup library to parse some web pages. The code I use to extract the article text is:
for i in a.findAll("p"):
    print(i.text)
And the output I get is:
Paragraph 1
Paragraph 2
Paragraph 3
Now I am processing multiple web pages, so I want to append each page's article paragraphs to a list as a single string element, like this:
['Paragraph 1 Paragraph 2 Paragraph 3']
What I did was:
string_list = [i.text for i in a.findAll("p")]
The result is:
print(string_list)
['Paragraph 1', 'Paragraph 2', 'Paragraph 3']
Answer 0 (score: 0)
bs4_p_tags = a.findAll("p")
this_page = []
for i in bs4_p_tags:
    this_page.append(i.text)
common_this_page_para = []
# Join this page's paragraph strings into one string, separated by single spaces.
single_string = " ".join(this_page)
common_this_page_para.append(single_string)
Please don't mind the long variable names; they are only there to make the explanation easier to follow.
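Since the question mentions several web pages, here is a minimal sketch of how the same join could be repeated per page. The helper name join_paragraphs and the sample data in pages are hypothetical and only stand in for the paragraph texts you would get from each parsed page, for example [i.text for i in a.findAll("p")]. Stripping whitespace and skipping empty paragraphs is an extra assumption, not something the question asked for.

```python
def join_paragraphs(paragraphs, separator=" "):
    """Collapse one page's paragraph texts into a single string."""
    # Strip surrounding whitespace and skip paragraphs that end up empty.
    cleaned = (text.strip() for text in paragraphs)
    return separator.join(text for text in cleaned if text)


# Hypothetical stand-in data: one inner list of paragraph texts per page.
# With BeautifulSoup, each inner list would come from the parsed page,
# e.g. [i.text for i in a.findAll("p")].
pages = [
    ["Paragraph 1", "Paragraph 2", "Paragraph 3"],
    ["Intro", "  Body  ", ""],
]

all_pages = []
for paragraphs in pages:
    # Each page contributes exactly one string element to the list.
    all_pages.append(join_paragraphs(paragraphs))

print(all_pages)
# ['Paragraph 1 Paragraph 2 Paragraph 3', 'Intro Body']
```

Keeping the join in a small function means the list for all pages is built with one append per page, which matches the ['Paragraph 1 Paragraph 2 Paragraph 3'] shape asked for in the question.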