I put the contents of a table into a list with this code:
from bs4 import BeautifulSoup

# html_doc holds the HTML source of the page (linked below)
soup = BeautifulSoup(html_doc, "html.parser")
for h1 in soup.find_all('h1'):
    print(h1.get_text())
for h2 in soup.find_all('h2'):
    print(h2.get_text())
# Keep only the <div id="ingredients"> block, then re-parse it
restricted_webpage = soup.find("div", {"id": "ingredients"})
readable_restricted = str(restricted_webpage)
soup2 = BeautifulSoup(readable_restricted, "html.parser")
rows = list()
for td in soup2.find_all('td'):
    rows.append(str(td.get_text()))
print(rows)
The result is cluttered with those \n characters. The html_doc can be found here.
['\n Cendres brutes (%)\n ', '\n 7.4\n ', '\n Cellulose brute (%)\n ', '\n 1.6\n ', '\n Fibres alimentaires (%)\n ', '\n 6.6\n ', '\n Matière grasse (%)\n ', '\n 16.0\n ', '\n Acide linoléique (%)\n ', '\n 3.1\n ', '\n Energie métabolisable (calculée selon NRC85) (kcal/kg)\n ', '\n 3652.5\n ', '\n Energie métabolisable (mesurée) (kcal/kg)\n ', '\n 3900.0\n ', '\n Humidité (%)\n ', '\n 9.5\n ', '\n Extrait non azoté (%)\n ', '\n 40.5\n ', '\n Oméga 6 (%)\n ', '\n 3.18\n ', '\n Protéine brute (%)\n ', '\n 25.0\n ', '\n Amidon (%)\n ', '\n 35.5\n ', '\n Chlore (%)\n ', '\n 1.43\n ', '\n Cuivre (mg/kg)\n ', '\n 15.0\n ', '\n Iode (mg/kg)\n ', '\n 2.9\n ', '\n Fer (mg/kg)\n ', '\n 167.0\n ', '\n Manganèse (mg/kg)\n ', '\n 68.0\n ', '\n Zinc (mg/kg)\n ', '\n 242.0\n ', '\n Biotine (mg/kg)\n ', '\n 3.13\n ', '\n Choline (mg/kg)\n ', '\n 1600.0\n ', '\n Acide folique (mg/kg)\n ', '\n 13.9\n ', '\n Vitamine A (UI/kg)\n ', '\n 32000.0\n ', '\n Vitamine B1 Thiamine (mg/kg)\n ', '\n 27.5\n ', '\n Vitamine B2 Riboflavine (mg/kg)\n ', '\n 49.6\n ', '\n Vitamine B3 Niacine (mg/kg)\n ', '\n 490.0\n ', '\n Vitamine B5 Acide pantothénique (mg/kg)\n ', '\n 147.8\n ', '\n Vitamine B6 Pyridoxine (mg/kg)\n ', '\n 77.1\n ', '\n Vitamine C (mg/kg)\n ', '\n 200.0\n ', '\n Vitamine D3 (UI/kg)\n ', '\n 800.0\n ', '\n Vitamine E (mg/kg)\n ', '\n 600.0\n ', '\n Arginine (%)\n ', '\n 1.53\n ', '\n Lutéine (mg/kg)\n ', '\n 5.0\n ', '\n Méthionine Cystine (%)\n ', '\n 1.18\n ', '\n Taurine (mg/kg)\n ', '\n 2900.0\n ']
Answer 0 (score: 1)
get_text() has stripping built in:
td.get_text(strip=True)
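As a minimal sketch of this, the snippet below compares get_text() with get_text(strip=True). The sample_html string is a made-up two-cell stand-in for the question's html_doc, not the real page. It also searches the div directly instead of re-parsing it, which should work because find() returns a Tag that has its own find_all().

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the question's html_doc (not the real page).
sample_html = """
<div id="ingredients">
  <table>
    <tr>
      <td>
        Cendres brutes (%)
      </td>
      <td>
        7.4
      </td>
    </tr>
  </table>
</div>
"""

soup = BeautifulSoup(sample_html, "html.parser")
# find() returns a Tag, so it can be searched without a second parse.
container = soup.find("div", {"id": "ingredients"})

# Without strip: the surrounding newlines and spaces are kept.
raw = [td.get_text() for td in container.find_all("td")]
# With strip=True: each text fragment is stripped before being joined.
clean = [td.get_text(strip=True) for td in container.find_all("td")]

print(raw)
print(clean)
```

Here clean should contain only the label and the value, with no \n padding.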
Answer 1 (score: 0)
The following should solve your problem:
rows = list(map(str.strip, rows))
As Padraic Cunningham said, you can also call the str.strip method directly on the result of td.get_text():
rows = list()
for td in soup2.find_all('td'):
    rows.append(td.get_text().strip())
An alternative that uses a list comprehension:
rows = [td.get_text().strip() for td in soup2.find_all('td')]
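For the cells in the question, get_text().strip() and get_text(strip=True) should give the same result, but they can differ when a cell contains nested tags. The sketch below illustrates this on a made-up cell (cell_html is hypothetical and does not come from the linked page):

```python
from bs4 import BeautifulSoup

# Hypothetical cell with nested tags; not taken from the question's page.
cell_html = "<td>\n  <b>Vitamine A</b>\n  <span>(UI/kg)</span>\n</td>"
td = BeautifulSoup(cell_html, "html.parser").td

# .strip() only trims the two ends of the combined text,
# so whitespace between the inner tags survives.
outer_only = td.get_text().strip()

# strip=True trims every text fragment, then joins them with no separator.
per_string = td.get_text(strip=True)

# Passing a separator keeps the fragments readable.
spaced = td.get_text(" ", strip=True)

print(repr(outer_only))
print(repr(per_string))
print(repr(spaced))
```

If the cells may contain markup, passing a separator together with strip=True is probably the safer choice.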