Question

I have been using the beautifulsoup library to parse some web pages. My code for extracting the article text is:

for i in a.findAll("p"):
    print(i.text)

And the output I get is:

Paragraph 1
Paragraph 2
Paragraph 3

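Now I am processing multiple web pages, so I want to append each page's article paragraphs to a list as a single string element. Something like this: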

['Paragraph 1 Paragraph 2 Paragraph 3']

What I did was:

string_list = [i.text for i in a.findAll("p")]

The result is one list element per paragraph rather than one element per page:

print(string_list)
['Paragraph 1', 'Paragraph 2', 'Paragraph 3']

Answer 1

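bs4_p_tags = a.findAll("p")
# Collect the text of every <p> tag on this page.
this_page = []
for i in bs4_p_tags:
    this_page.append(i.text)
# Create this list once, outside any per-page loop, so it accumulates across pages.
common_this_page_para = []
# Join the string elements of the iterable with a single space as the separator.
single_string = " ".join(this_page)
common_this_page_para.append(single_string)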

Please don't mind the long variable names; they are only there to make the explanation easier to follow.
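For several pages, the same join can be wrapped in a small helper and called once per page. The following is only a minimal sketch: `join_paragraphs`, `pages`, and `all_pages` are hypothetical names, and the inner lists are stand-in data playing the role of `[i.text for i in a.findAll("p")]` for each parsed page, so that the example runs without fetching anything.

```python
def join_paragraphs(paragraphs):
    """Collapse one page's paragraph strings into a single space-separated string."""
    return " ".join(text.strip() for text in paragraphs)


# Stand-in data (an assumption for illustration): each inner list represents
# the paragraph texts extracted from one parsed page.
pages = [
    ["Paragraph 1", "Paragraph 2", "Paragraph 3"],
    ["Another page, first paragraph", "second paragraph"],
]

# Create the result list once, then append exactly one string per page.
all_pages = []
for paragraphs in pages:
    all_pages.append(join_paragraphs(paragraphs))

print(all_pages)
# ['Paragraph 1 Paragraph 2 Paragraph 3', 'Another page, first paragraph second paragraph']
```

Keeping the result list outside the loop is what gives one element per page; re-creating it inside the loop would discard the earlier pages.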

Append multiple strings to a list as a single string
