How to concatenate text to the items in a list

Date: 2019-04-11 15:40:54

Tags: python web-scraping concatenation

How can I wrap each line of the "bullets_text" variable in an opening <li> tag and a closing </li> tag?

Current result:

24.2MP APS-C CMOS Sensor DIGIC 6 Image Processor
3.0" 1.04m-Dot Vari-Angle Touchscreen Full HD 1080p Video Recording at 60 fps

Desired result:

<li>24.2MP APS-C CMOS Sensor</li> <li>DIGIC 6 Image Processor</li>
<li>3.0" 1.04m-Dot Vari-Angle Touchscreen</li> <li>Full HD 1080p Video
Recording at 60 fps</li>

Current code:

from bs4 import BeautifulSoup
import urllib.request
import pandas as pd


def get_bullets(urls):
    # Accept a comma-separated string of product URLs
    urls = urls.split(",")
    dfs = []
    for url in urls:
        page = urllib.request.urlopen(url)
        soup = BeautifulSoup(page, 'lxml')
        # The SKU is the sixth '/'-separated segment of the URL
        sku = url.split('/')[5]
        content = soup.find('div', class_='js-productHighlights product-highlights c28 fs14 js-close')
        bullets = content.find_all('li', class_='top-section-list-item')
        # Join the text of every bullet, one per line
        bullets_text = '\n'.join([bullet.text for bullet in bullets])
        temp_df = pd.DataFrame([[sku, bullets_text]], columns=['sku', 'bullets'])
        dfs.append(temp_df)
    # Combine the per-URL frames and write them to a single CSV file
    df = pd.concat(dfs, ignore_index=True)
    df.to_csv('book2.csv', index=False)


get_bullets(input('enter url'))

User input: https://www.bhphotovideo.com/c/product/1225875-REG/canon_1263c004_eos_80d_dslr_camera.html

1 answer:

Answer 0 (score: 1)

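To wrap each item in the list in <li> tags, concatenate the tags onto the text of each bullet inside the list comprehension. Change the line that builds bullets_text to: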

bullets_text = '\n'.join([ "<li>"+bullet.text+"</li>" for bullet in bullets ])

That's all.

Output:

<li>24.2MP APS-C CMOS Sensor</li>
<li>DIGIC 6 Image Processor</li>
<li>3.0" 1.04m-Dot Vari-Angle Touchscreen</li>
<li>Full HD 1080p Video Recording at 60 fps</li>
<li>45-Point All Cross-Type AF System</li>
<li>Dual Pixel CMOS AF</li>
<li>Expanded ISO 25600, Up to 7 fps Shooting</li>
<li>Built-In Wi-Fi with NFC</li>
<li>RGB+IR 7560-Pixel Metering Sensor</li>
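
The same idea can be tried without fetching the page. The following is a minimal sketch, not part of the original answer: the helper name wrap_in_li and the sample list are hypothetical, and the calls to strip() and html.escape() are optional additions that assume the scraped text may contain stray whitespace or characters such as & and <.

```python
from html import escape


def wrap_in_li(items):
    """Wrap each string in <li>...</li> and join the results with newlines.

    Hypothetical helper for illustration; `items` stands in for the
    `bullet.text` values collected in the question's code.
    """
    # strip() removes surrounding whitespace; escape(quote=False) encodes
    # &, < and > but leaves quote characters (as in 3.0") untouched.
    return "\n".join(
        f"<li>{escape(item.strip(), quote=False)}</li>" for item in items
    )


# Sample data copied from the output shown in the answer
sample = [
    "24.2MP APS-C CMOS Sensor",
    "DIGIC 6 Image Processor",
    '3.0" 1.04m-Dot Vari-Angle Touchscreen',
]
print(wrap_in_li(sample))
```

In the question's code, this would correspond to something like bullets_text = wrap_in_li(bullet.text for bullet in bullets). If the text is already known to be clean, the plain string concatenation from the answer is enough.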