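I am trying to scrape the bar chart from the page below, but I am not sure which BeautifulSoup features I need to extract this kind of chart. I would also appreciate a link to any guide on which BeautifulSoup features to use for scraping different types of charts and graphs. https://www.statista.com/statistics/215655/number-of-registered-weapons-in-the-us-by-state/
Here is my code so far: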
import pandas as pd
import requests
from bs4 import BeautifulSoup
dp = 'https://www.statista.com/statistics/215655/number-of-registered-weapons-in-the-us-by-state/'
page = requests.get(dp).text
soup = BeautifulSoup(page, 'html.parser')
#This is what I am trying to figure out
new = soup.find("div", id="bar")
print(new)
Answer 0 (score: 2)
This script will get all the data from the bar chart:
import requests
import pandas as pd
from bs4 import BeautifulSoup
url = 'https://www.statista.com/statistics/215655/number-of-registered-weapons-in-the-us-by-state/'
soup = BeautifulSoup(requests.get(url).content, 'html.parser')
tds = soup.select('#statTableHTML td')
data = []
# Cells alternate state, number, so pair even-indexed cells with odd-indexed ones
for td1, td2 in zip(tds[::2], tds[1::2]):
    data.append({'State': td1.text, 'Number': td2.text})
df = pd.DataFrame(data)
print(df)
Prints:
State Number
0 Texas 725,368
1 Florida 432,581
2 California 376,666
3 Virginia 356,963
4 Pennsylvania 271,427
5 Georgia 225,993
6 Arizona 204,817
7 North Carolina 181,209
8 Ohio 175,819
9 Alabama 168,265
10 Illinois 147,698
11 Wyoming 134,050
12 Indiana 133,594
13 Maryland 128,289
14 Tennessee 121,140
15 Washington 119,829
16 Louisiana 116,398
17 Colorado 112,691
18 Arkansas 108,801
19 New Mexico 105,836
20 South Carolina 99,283
21 Minnesota 98,585
22 Nevada 96,822
23 Kentucky 93,719
24 Utah 93,440
25 New Jersey 90,217
26 Missouri 88,270
27 Michigan 83,355
28 Oklahoma 83,112
29 New York 82,917
30 Wisconsin 79,639
31 Connecticut 74,877
32 Oregon 74,722
33 District of Columbia 59,832
34 New Hampshire 59,341
35 Idaho 58,797
36 Kansas 54,409
37 Mississippi 52,346
38 West Virginia 41,651
39 Massachusetts 39,886
40 Iowa 36,540
41 South Dakota 31,134
42 Nebraska 29,753
43 Montana 23,476
44 Alaska 20,520
45 North Dakota 19,720
46 Maine 17,410
47 Hawaii 8,665
48 Vermont 7,716
49 Delaware 5,281
50 Rhode Island 4,655
51 *Other US Territories 866
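Note that the Number column above holds strings such as '725,368', not integers. If you want to sort or sum the values, they need converting first. The following is a minimal offline sketch of the same select-and-pair approach with that conversion added. SAMPLE_HTML is a hypothetical stand-in for the page's table (the real markup may differ), and table_to_frame is a helper name made up for this example.

```python
import pandas as pd
from bs4 import BeautifulSoup

# Hypothetical stand-in for the page's data table; the real markup may differ.
SAMPLE_HTML = """
<table id="statTableHTML">
  <tr><td>Texas</td><td>725,368</td></tr>
  <tr><td>Florida</td><td>432,581</td></tr>
  <tr><td>*Other US Territories</td><td>866</td></tr>
</table>
"""


def table_to_frame(html):
    """Pair up the table cells and return a DataFrame with integer counts."""
    soup = BeautifulSoup(html, "html.parser")
    # CSS selector: every <td> inside the element with id="statTableHTML"
    tds = soup.select("#statTableHTML td")
    rows = [
        {"State": state.get_text(strip=True), "Number": number.get_text(strip=True)}
        for state, number in zip(tds[::2], tds[1::2])
    ]
    df = pd.DataFrame(rows)
    # Strip the thousands separators, then convert the strings to integers
    df["Number"] = df["Number"].str.replace(",", "", regex=False).astype(int)
    return df


df = table_to_frame(SAMPLE_HTML)
print(df.sort_values("Number", ascending=False))
```

To try it on the live page, you could pass requests.get(url).content to table_to_frame instead of SAMPLE_HTML, assuming the page still serves the table under that id.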
Answer 1 (score: 1)
You may be able to find more information about web scraping on this site: https://www.datacamp.com/community/tutorials/web-scraping-using-python