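I'm trying to display the product titles and prices I scraped from a link, and to number them. This is my code so far. However, I want the count and the product name on the same line, and the product price on the next line. How should I modify my code?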
import requests
from bs4 import BeautifulSoup
url = 'https://scrapingclub.com/exercise/list_basic/?page=1'
response = requests.get(url)
soup = BeautifulSoup(response.text, 'lxml')
items = soup.find_all('div', class_='col-lg-4 col-md-6 mb-4')
count = 1
for i in items:
    productName = i.find('h4', class_='card-title').text
    productPrice = i.find('h5').text
    count = count + 1
    print(str(count) + '. ' + productName + 'Price: ' + productPrice)
Answer 0 (score: 2)
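The problem right now is that the product name begins and ends with a newline character, '\n'. To get rid of that, we can use the strip() method. Also, for the print statement, since the product name no longer ends with a newline ('\n'), we need to change ... + 'Price' + ... to ... + '\nPrice' + ... so that the price starts on its own line.
This looks like a fun web-scraping project! Hope this helps.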
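The two changes described in this answer can be tried without touching the network. In the minimal sketch below, the raw strings are hypothetical stand-ins for what .text might return for the title and price tags (text wrapped in newlines), and format_item is an illustrative helper name, not part of the original code.

```python
# Hypothetical raw values, mimicking what `.text` might return for the tags.
raw_name = '\nShort Dress\n'
raw_price = '\n$24.99\n'


def format_item(count, raw_name, raw_price):
    """Return 'N. name' on one line and 'Price: ...' on the next."""
    name = raw_name.strip()    # drop the leading/trailing newlines
    price = raw_price.strip()
    # The explicit '\n' before 'Price' puts the price on its own line.
    return str(count) + '. ' + name + '\nPrice: ' + price


print(format_item(1, raw_name, raw_price))
```

In the scraping loop, the same idea would apply to i.find('h4', class_='card-title').text and i.find('h5').text.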
Answer 1 (score: 0)
One way to try:
import requests
from bs4 import BeautifulSoup
url = 'https://scrapingclub.com/exercise/list_basic/?page=1'
response = requests.get(url)
soup = BeautifulSoup(response.text, 'lxml')
items = soup.find_all('div', class_='col-lg-4 col-md-6 mb-4')
count = 1
for i in items:
    productName = i.find('h4', class_='card-title').get_text(strip=True)
    productPrice = i.find('h5').get_text(strip=True)
    count = count + 1
    print(str(count) + '. ' + productName + '\nPrice: ' + productPrice)
Output:
2. Short Dress
Price: $24.99
3. Patterned Slacks
Price: $29.99
4. Short Chiffon Dress
Price: $49.99
5. Off-the-shoulder Dress
Price: $59.99
6. V-neck Top
Price: $24.99
7. Short Chiffon Dress
Price: $49.99
8. V-neck Top
Price: $24.99
9. V-neck Top
Price: $24.99
10. Short Lace Dress
Price: $59.99
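Note that the numbering in this output starts at 2, because count is incremented before the first print. One possible way to start at 1 is to let enumerate do the counting. The sketch below is only an illustration: it uses a hypothetical hard-coded list of (name, price) pairs, with values taken from the output above, in place of the scraped tags, and numbered_lines is an assumed helper name.

```python
# Hypothetical stand-in for the scraped (name, price) pairs.
products = [
    ('Short Dress', '$24.99'),
    ('Patterned Slacks', '$29.99'),
    ('Short Chiffon Dress', '$49.99'),
]


def numbered_lines(products):
    """Build one 'N. name\\nPrice: price' string per product, counting from 1."""
    lines = []
    # enumerate(..., start=1) yields the counter alongside each pair,
    # so no manual `count = count + 1` is needed.
    for count, (name, price) in enumerate(products, start=1):
        lines.append(f'{count}. {name}\nPrice: {price}')
    return lines


print('\n'.join(numbered_lines(products)))
```

With the real page, the loop would iterate over items and read the name and price with get_text(strip=True) as in the answer.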
Answer 2 (score: 0)
Here is an example of what you could do, as inspiration :)
import json
import requests
from bs4 import BeautifulSoup
url = 'https://scrapingclub.com/exercise/list_basic/?page=1'
response = requests.get(url)
soup = BeautifulSoup(response.text, 'lxml')
items = soup.find_all('div', class_='col-lg-4 col-md-6 mb-4')
# Report how many items were found, using the length of the list
print('the scraper found', len(items), 'items on the page')
# An empty list to append to
obj = []
# Convert a price string such as '$24.99' into {'$': 24.99},
# so the value is easier to use in calculations
def getPrice(text):
    if text.startswith('$'):
        return {'$': float(text[1:])}
    return None  # no leading '$': the price becomes null in the JSON
# Append one dict per product to the list
for item in items:
    obj.append(
        {
            'productName': item.find('h4', class_='card-title').text.strip(),
            'productPrice': getPrice(item.find('h5').text.strip()),
        }
    )
print(json.dumps(obj, indent=2))
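Storing the price as a float is what makes later arithmetic straightforward. As a rough sketch of that idea, the snippet below sums the prices in a list shaped like obj. The sample data is hypothetical (the 'Unpriced Item' entry is invented to show the case where getPrice returns nothing), and total_price is an assumed helper name.

```python
# Hypothetical sample in the same shape as `obj` above.
sample = [
    {'productName': 'Short Dress', 'productPrice': {'$': 24.99}},
    {'productName': 'Patterned Slacks', 'productPrice': {'$': 29.99}},
    {'productName': 'Unpriced Item', 'productPrice': None},  # invented entry
]


def total_price(products, currency='$'):
    """Sum the prices for one currency symbol, skipping items with no price."""
    prices = [
        p['productPrice'][currency]
        for p in products
        if p['productPrice'] is not None
    ]
    return sum(prices)


# Round for display only; floats are not exact for currency amounts.
print(round(total_price(sample), 2))
```

For real money calculations, decimal.Decimal would likely be a safer choice than float.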