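I need help formatting the output of my Python web scrape. For whatever reason, when I get the information I need, the words seem to be indented or spaced incorrectly, and I'm not sure how to fix it.
Any help is appreciated.
Thanks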
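import requests
from bs4 import BeautifulSoup

# Fetch the listing page
r = requests.get("http://www.canadianappliance.ca/Refrigerators-And-Fridges-3/Full-Size-Refrigerators-38/French-Door-Refrigerators-48/?per_page=all")
# Name the parser explicitly so the result does not depend on which parsers are installed
soup = BeautifulSoup(r.content, "html.parser")
# Each product title sits in an <h2 class="product_link"> element
g_data = soup.find_all("h2", {"class": "product_link"})
for item in g_data:
    print(item.text)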
Answer 0 (score: 1)
Use .get_text() and pass the strip argument. Also, replace newlines with spaces:
g_data = soup.find_all("h2", {"class": "product_link"})
for item in g_data:
    print(item.get_text(strip=True).replace("\n", " "))
Prints:
Samsung - RF220NCTASR
Samsung - RF18HFENBSR
Samsung - RF23HCEDBSR
...
Haier - HRF15N3AGS
GE Profile - PWE23KMKES
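If tabs or runs of spaces still show up inside a title after the newline replacement, one common alternative is to collapse every run of whitespace into a single space with split() and join(). The sketch below is a minimal illustration, not part of the answer above: the helper name collapse_whitespace and the sample string are made up, and it assumes you pass in the text you already extracted from each element.

```python
def collapse_whitespace(text):
    """Collapse tabs, newlines, and repeated spaces into single spaces.

    Hypothetical helper for illustration. str.split() with no arguments
    splits on any run of whitespace and drops leading/trailing whitespace,
    so joining the pieces with one space normalizes the string.
    """
    return " ".join(text.split())


# Made-up sample resembling badly indented scraped text
raw = "\n\t Samsung -\n\t RF220NCTASR \n"
print(collapse_whitespace(raw))  # Samsung - RF220NCTASR
```

In the loop above, this would be applied as collapse_whitespace(item.get_text()), which also covers tab characters that a plain newline replacement would leave behind.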