Question

I'm trying to do some web scraping with BeautifulSoup, and I'd like to use the CSS :nth-child() filter with the .select() method. Is this feature implemented at all? Is there a better way to extract specific elements with BeautifulSoup?

I know there is the :nth-of-type() selector, but it is not that intuitive. Any ideas? For reference, the kind of select() call in question:

    # import dependencies
    from bs4 import BeautifulSoup
    import requests

    def getSoup(url):
        # fetch the raw page
        source_code = requests.get(url)
        # convert the response to plain text
        plain_text = source_code.text
        # parse with the lxml parser
        soup = BeautifulSoup(plain_text, 'lxml')
        return soup

    # get data from site
    baseUrl = "https://stackoverflow.com/questions/"
    questionId = 48139550
    # create our URL (the id is an int, so convert it before concatenating)
    url = baseUrl + str(questionId)

    try:
        page_soup = getSoup(url)
        poi = page_soup.select('#sidebar > div.module.community-bulletin > div > div:nth-child(4) > div.bulletin-item-content > a')
        print(poi)
    except Exception as e:
        print(e)
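To make the difference between the two pseudo-classes concrete, here is a minimal sketch on an inline HTML snippet. It assumes a reasonably recent Beautiful Soup 4 (as far as I know, 4.7.0 and later delegate select() to the soupsieve package, which handles :nth-child(); older releases only implemented :nth-of-type()). The markup, the `#box` id, and the `texts` helper are made up for illustration.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the <div> has four children (h2, p, span, p).
html = """
<div id="box">
  <h2>Title</h2>
  <p>first</p>
  <span>middle</span>
  <p>second</p>
</div>
"""

# html.parser ships with Python, so no extra parser is needed here.
soup = BeautifulSoup(html, "html.parser")


def texts(selector):
    """Illustrative helper: text of every element matched by a CSS selector."""
    return [el.get_text() for el in soup.select(selector)]


# :nth-child(n) counts ALL sibling elements, whatever their tag.
p_nth_child_2 = texts("#box > p:nth-child(2)")   # the <p> that is the 2nd child
p_nth_child_4 = texts("#box > p:nth-child(4)")   # the <p> that is the 4th child
p_nth_child_3 = texts("#box > p:nth-child(3)")   # 3rd child is a <span>, so no match

# :nth-of-type(n) counts only siblings with the same tag name.
p_nth_of_type_2 = texts("#box > p:nth-of-type(2)")  # the 2nd <p> among its siblings

print(p_nth_child_2, p_nth_child_4, p_nth_child_3, p_nth_of_type_2)
```

On an older Beautiful Soup where :nth-child() is not accepted, `p:nth-of-type(2)` is the closest equivalent, with the caveat that it counts only elements of the same tag.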

Using the :nth-child() CSS filter in BeautifulSoup

0 answers: