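I am trying to get data from this Wikipedia article, which contains a table of every national park along with some details for each park. By adapting code from a similar tutorial I found, I can display each park's name and state, but the park's area does not come out correctly. I suspect this is because the name and state are links in the Wikipedia article while the area is not, though I am not sure. How can I change the code so that the area is displayed?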
import requests
from bs4 import BeautifulSoup
URL = "https://en.wikipedia.org/wiki/List_of_national_parks_of_the_United_States"
res = requests.get(URL).text
soup = BeautifulSoup(res,'html.parser')
for items in soup.find('table', class_='wikitable').find_all('tr')[1::1]:
    data = items.find_all(['th','td'])
    try:
        parkName = data[0].a.text
        parkState = data[2].a.text
        parkArea = data[4].span.text
    except IndexError:pass
    print("{} | {} | {}".format(parkName, parkState, parkArea))
Answer 0 (score: 1)
To get the text of the area cell, you can use .get_text() and then str.rsplit() to keep only the area in acres:
import requests
from bs4 import BeautifulSoup
url = "https://en.wikipedia.org/wiki/List_of_national_parks_of_the_United_States"
soup = BeautifulSoup(requests.get(url).content,'html.parser')
rows = iter(soup.select('.wikitable tr:has(td, th)'))
next(rows) # skip headers
for tr in rows:
    name, _, state, _, area, *_ = tr.select('td, th')
    name = name.get_text(strip=True)
    state = state.a.get_text(strip=True)
    area = area.get_text(strip=True).rsplit(maxsplit=2)[0]
    print('{:<35}{:<25}{}'.format(name, state, area))
Prints:
Acadia Maine 49,076.63 acres
American Samoa American Samoa 8,256.67 acres
Arches Utah 76,678.98 acres
Badlands South Dakota 242,755.94 acres
Big Bend Texas 801,163.21 acres
Biscayne Florida 172,971.11 acres
Black Canyon of the Gunnison Colorado 30,779.83 acres
Bryce Canyon Utah 35,835.08 acres
Canyonlands Utah 337,597.83 acres
Capitol Reef Utah 241,904.50 acres
Carlsbad Caverns* New Mexico 46,766.45 acres
Channel Islands California 249,561.00 acres
Congaree South Carolina 26,476.47 acres
Crater Lake Oregon 183,224.05 acres
Cuyahoga Valley Ohio 32,571.88 acres
Death Valley California 3,408,406.73 acres
Denali Alaska 4,740,911.16 acres
Dry Tortugas Florida 64,701.22 acres
Everglades Florida 1,508,938.57 acres
Gates of the Arctic Alaska 7,523,897.45 acres
Gateway Arch Missouri 192.83 acres
Glacier Montana 1,013,125.99 acres
Glacier Bay Alaska 3,223,383.43 acres
Grand Canyon* Arizona 1,201,647.03 acres
Grand Teton Wyoming 310,044.36 acres
Great Basin Nevada 77,180.00 acres
Great Sand Dunes Colorado 107,341.87 acres
Great Smoky Mountains North Carolina 522,426.88 acres
Guadalupe Mountains Texas 86,367.10 acres
Haleakalā Hawaii 33,264.62 acres
Hawaiʻi Volcanoes Hawaii 325,605.28 acres
Hot Springs Arkansas 5,554.15 acres
Indiana Dunes Indiana 15,349.08 acres
Isle Royale Michigan 571,790.30 acres
Joshua Tree California 795,155.85 acres
Katmai Alaska 3,674,529.33 acres
Kenai Fjords Alaska 669,650.05 acres
Kings Canyon California 461,901.20 acres
Kobuk Valley Alaska 1,750,716.16 acres
Lake Clark Alaska 2,619,816.49 acres
Lassen Volcanic California 106,589.02 acres
Mammoth Cave Kentucky 54,011.91 acres
Mesa Verde* Colorado 52,485.17 acres
Mount Rainier Washington 236,381.64 acres
North Cascades Washington 504,780.94 acres
Olympic Washington 922,649.41 acres
Petrified Forest Arizona 221,390.21 acres
Pinnacles California 26,685.73 acres
Redwood* California 138,999.37 acres
Rocky Mountain Colorado 265,807.25 acres
Saguaro Arizona 91,715.72 acres
Sequoia California 404,062.63 acres
Shenandoah Virginia 199,223.77 acres
Theodore Roosevelt North Dakota 70,446.89 acres
Virgin Islands U.S. Virgin Islands 15,052.53 acres
Voyageurs Minnesota 218,222.35 acres
White Sands New Mexico 146,344.31 acres
Wind Cave South Dakota 33,970.84 acres
Wrangell–St. Elias* Alaska 8,323,146.48 acres
Yellowstone Wyoming 2,219,790.71 acres
Yosemite* California 761,747.50 acres
Zion Utah 147,242.66 acres
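To see why rsplit(maxsplit=2)[0] keeps only the acre figure, here is a small offline sketch. The sample string is an assumption about what the cell text looks like after get_text(strip=True) (the acre value followed by a parenthesised km2 value), and the helper name acres_part is made up for illustration.

```python
def acres_part(cell_text: str) -> str:
    """Return everything before the last two whitespace-separated tokens."""
    # rsplit works from the right; maxsplit=2 peels off at most two tokens,
    # here "(198.6" and "km2)", so element 0 is the acre figure.
    return cell_text.rsplit(maxsplit=2)[0]


sample = "49,076.63 acres (198.6 km2)"  # assumed cell text, not fetched live
print(sample.rsplit(maxsplit=2))  # the three pieces produced by the split
print(acres_part(sample))         # only the first piece is kept
```

Because the split counts tokens from the right, the length of the acre value itself does not matter, as long as the cell ends with exactly two trailing tokens.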
Answer 1 (score: 0)
You can change the line that sets parkArea.
If you want the area in acres, use:
parkArea = data[4].span.text
Or, for the area in km2:
parkArea = data[4].text.split(' ')[0]
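If the values are needed as numbers rather than display strings, a regular expression can pull both units out of the cell text in one pass. This is only a sketch: it assumes the cell text looks like "49,076.63 acres (198.6 km2)", and parse_area is a hypothetical helper that is not part of the question's code.

```python
import re

# Assumed layout of the area cell text: "<acres> acres (<km2> km2)".
AREA_RE = re.compile(r"([\d,.]+)\s*acres\s*\(([\d,.]+)\s*km2\)")


def parse_area(cell_text):
    """Return (acres, km2) as floats, or None if the text does not match."""
    match = AREA_RE.search(cell_text)
    if match is None:
        return None
    # Drop thousands separators before converting to float.
    acres, km2 = (float(group.replace(",", "")) for group in match.groups())
    return acres, km2


print(parse_area("49,076.63 acres (198.6 km2)"))
```

Inside the loop this would be called as parse_area(data[4].get_text()), assuming the area really is in the fifth cell as in the question.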