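I know this question has been asked before, but I am struggling to get my code to work. The scraped output contains "\n", which needs to be removed.
This is the code I use to scrape: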
c.JSON(http.StatusOK, gin.H{
    "data": gin.H{
        "result": []string{"This is a sample response"},
    },
    "code":   http.StatusOK,
    "status": "success",
})
And here is the output:
import bs4 as bs
import urllib.request

source = urllib.request.urlopen('https://en.wikipedia.org/wiki/List_of_motorway_service_areas_in_the_United_Kingdom#:~:text=Only%2020%20motorway%20services%20in,leases%20to%20private%20operating%20companies.').read()
soup = bs.BeautifulSoup(source, 'lxml')
table = soup.table
table_rows = table.find_all('tr')
for tr in table_rows:
    td = tr.find_all('td')
    row = [i.text for i in td]
    print(row)
Answer 0 (score: 0)
for tr in table_rows:
    td = tr.find_all('td')
    # strip() removes '\n' and other surrounding whitespace from each cell;
    # the `if` filter drops cells that are empty after stripping
    row = [i.text.strip() for i in td if i.text.strip()]
    print(row)
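
To see what the strip-and-filter step does on its own, here is a minimal sketch that needs no network access or parser. The cell values and the `clean_row` helper are hypothetical and only stand in for `[i.text for i in td]` on a single row; they are not taken from the Wikipedia table.

```python
# Hypothetical cell texts, standing in for [i.text for i in td] on one table row.
raw_cells = ["Alpha\n", "M1\n", "\n", "  Example Operator \n"]


def clean_row(cells, drop_empty=True):
    """Strip surrounding whitespace from each cell; optionally drop empty cells."""
    stripped = [text.strip() for text in cells]
    if drop_empty:
        # Same effect as the `if i.text.strip()` filter in the answer above.
        return [text for text in stripped if text]
    return stripped


print(clean_row(raw_cells))                    # ['Alpha', 'M1', 'Example Operator']
print(clean_row(raw_cells, drop_empty=False))  # ['Alpha', 'M1', '', 'Example Operator']
```

Dropping empty cells can shift the remaining values into different column positions, so if the rows are going to be loaded into a table later, it may be safer to strip the cells but keep the empty ones (`drop_empty=False` in this sketch).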