I am just getting started with BeautifulSoup in Python, and I want to scrape the package name and the price of every app listed on an Android Play Store page.
To get the package names, I use the following code:
from bs4 import BeautifulSoup
from requests import get

url = "https://play.google.com/store/apps/category/NEWS_AND_MAGAZINES/collection/topselling_paid"
response = get(url)
html_soup = BeautifulSoup(response.text, 'html.parser')
# Each app card is a <div> whose class attribute is exactly this string
app_container = html_soup.find_all('div', class_="card no-rationale square-cover apps small")
Answer 0 (score: 4)
for app in html_soup.select('.card.no-rationale.square-cover.apps.small'):
    # Take the first element matching each class inside the card
    title = app.select('.title')[0].text
    price = app.select('.price')[0].text
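To try the selector approach without hitting the network, here is a minimal sketch that runs against a small hand-written HTML fragment. The markup and the app names are assumptions modelled on the class names used in this thread, not a copy of the real Play Store page, which may differ or change. It also shows one difference worth knowing: a CSS selector matches a card regardless of class order, while `class_` with a multi-word string only matches that exact attribute value.

```python
from bs4 import BeautifulSoup

# Hypothetical markup, modelled on the class names used in this thread.
SAMPLE_HTML = """
<div class="card no-rationale square-cover apps small">
  <a class="title" title="Example App One">Example App One</a>
  <span class="display-price">$1.99</span>
</div>
<div class="small apps card square-cover no-rationale">
  <a class="title" title="Example App Two">Example App Two</a>
  <span class="display-price">$0.99</span>
</div>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# A CSS selector matches a card no matter how its classes are ordered.
cards = soup.select(".card.no-rationale.square-cover.apps.small")

# class_ with a multi-word string only matches that exact attribute value.
exact = soup.find_all("div", class_="card no-rationale square-cover apps small")

results = []
for card in cards:
    # select_one returns the first match, like select(...)[0]
    title = card.select_one(".title").get_text(strip=True)
    price = card.select_one(".display-price").get_text(strip=True)
    results.append((title, price))

print(len(cards), len(exact))
print(results)
```

If the class order on the live page ever varies, the selector form is the safer of the two.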
Answer 1 (score: 1)
This is just one alternative.
from bs4 import BeautifulSoup
import requests
import pandas as pd

url = "https://play.google.com/store/apps/category/NEWS_AND_MAGAZINES/collection/topselling_paid"
response = requests.get(url)
html_soup = BeautifulSoup(response.text, 'html.parser')
app_container = html_soup.find_all('div', class_="card no-rationale square-cover apps small")

apptitle = []
appprice = []
for app in app_container:
    # The app name is stored in the title attribute of the <a class="title"> link
    title = app.find('a', class_='title')
    title_text = title['title']
    apptitle.append(title_text)
    # The price is the text of the <span class="display-price"> element
    price_text = app.find('span', class_="display-price").text
    appprice.append(price_text)

df = pd.DataFrame({"App_Title": apptitle, "App_Price": appprice})
print(df)
Output:
    App_Price  App_Title
 0      $3.99  Pocket Casts
 1      $2.99  Broadcastify Police Scanner Pro
 2      $3.99  Sync for reddit (Pro)
 3      $2.99  reddit is fun golden platinum (unofficial)
 4      $2.99  Relay for reddit (Pro)
 5      $2.99  DoggCatcher Podcast Player
 6      $1.99  BaconReader Premium for Reddit
 7      $0.99  The Drudge View Pro
 8      $3.99  Sync for reddit (Dev)
 9      $1.49  Conservative News Pro
10      $4.99  News+ Premium
11      $0.99  Mega Millions + Powerball Lotto Games in US
12      $2.99  VR Browser for Reddit
13      $3.99  Tiny Tiny RSS Unlocker
14      $3.49  Push to Kindle
15      $0.99  The Black Vault
16      $1.69  No Agendroid - No Agenda App
17      $4.99  Police Scanner
18      $0.99  1 Radio News Pro: More Features and Shows, No Ads
19      $0.99  Lotto Results Premium - Lottery Games in US
20      $4.99  JREPro - No Ads
21     $10.99  NHK News Donation Version
22      $0.99  U.S. 270
23      $1.49  Pure news widget (scrollable)
24      $0.99  Lake Okeechobee Levels
25      $0.99  National Catholic Register
26      $0.99  The One America News View Pro
27      $1.49  RSS Reader Pro
28      $3.99  YSN Live
29      $1.99  Ultimate Conspiracy Premium
30      $0.99  News Reader Pro
31      $0.99  Tenno Watcher
32     $13.99  The Aviation Herald
33      $2.96  Metro Reader Pro
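One thing to watch with the `find`-based loop is that `find` returns `None` when a card has no matching element, so a single incomplete card would raise an error. The following is a rough sketch of a more defensive variant, not part of the original answer. The helper name `extract_apps`, the constant `CARD_CLASSES`, and the sample markup are all hypothetical and only mirror the class names used in this thread.

```python
from bs4 import BeautifulSoup

# Assumed class string, copied from the find_all call used in this thread.
CARD_CLASSES = "card no-rationale square-cover apps small"


def extract_apps(html):
    """Return one {"App_Title", "App_Price"} dict per complete app card."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for card in soup.find_all("div", class_=CARD_CLASSES):
        title = card.find("a", class_="title")
        price = card.find("span", class_="display-price")
        # find() returns None when nothing matches, so skip incomplete cards
        if title is None or price is None:
            continue
        rows.append({
            # Prefer the title attribute, fall back to the link text
            "App_Title": title.get("title", title.get_text(strip=True)),
            "App_Price": price.get_text(strip=True),
        })
    return rows


# Hypothetical markup: the second card has no price element.
sample = """
<div class="card no-rationale square-cover apps small">
  <a class="title" title="Example App One">Example App One</a>
  <span class="display-price">$1.99</span>
</div>
<div class="card no-rationale square-cover apps small">
  <a class="title" title="Example App Two">Example App Two</a>
</div>
"""

rows = extract_apps(sample)
print(rows)
```

The resulting list of dicts can be passed directly to `pd.DataFrame(rows)` to get the same two-column table as above.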