I want to extract data from a website and build a list of dictionaries.
/PycharmProjects/MyFirstOne/WEBSCRAPPING/Work with Soup data.py"
$125.70 [{'price': '$125.70', 'price per gramm': ''}]
$35.70 [{'price': '$35.70', 'price per gramm': ''}, {'price': '$35.70', 'price per gramm': ''}]
Process finished with exit code 0
As a result, the list grows with each append, but every entry in it ends up with the same price.
import json

import pandas as pd
import plotly.express as px
from plotly.offline import plot

# Raw strings keep the Windows backslashes literal
with open(r'geojson\plotly--master\china.json', encoding='utf-8') as f:
    provinces_map = json.load(f)
data = pd.read_excel(r'D:\ALTF\jupyterlab\railway_demand_analysis\data\transportation_province.xlsx')
token = 'pk.eyJ1IjoidGFuZ2ZlaWxpIiwiYSI6ImNrOGZ1cGp5YjA4ZzUzZ29kdnQ0ZzUzNXUifQ.TG9Ul9DA6t4reVAbQh_5LA'
fig = px.choropleth_mapbox(
    data_frame=data,
    geojson=provinces_map,
    color='railway',
    locations='id',
    hover_name='region',
    hover_data=['年份'],
    animation_frame='年份',
    featureidkey='properties.NL_NAME_1',
    mapbox_style='carto-darkmatter',  # the built-in style name uses a hyphen
    color_continuous_scale='viridis',
    center={'lat': 37.110573, 'lon': 106.493924},
    zoom=3
)
fig.update_layout(
    mapbox={'accesstoken': token, 'center': {'lat': 37.110573, 'lon': 106.493924}, 'zoom': 3, 'style': 'light'},
    title={'text': '各省份货运量分布情况'}
)
# fig.show()
plot(fig, validate=False, filename='d4-great-circle.html')
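Please help me solve this problem.
Answer 0: (score: 0)
It looks like you are modifying the same "item" object on every iteration and appending it again. Try moving the first line into the loop: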
from bs4 import BeautifulSoup

list_of_sku = []
with open('./FULL_data.html', 'r') as f:
    data = f.read()
# Parse the text already read instead of opening the file a second time
soup = BeautifulSoup(data, "html.parser")
for divs in soup.find_all('div', attrs={"class": "col-xs-6 col-sm-4"})[:2]:
    links = divs.find_all("tr")
    for row in links:
        # We get list of prices here
        item_text = row.find('td')
        if item_text:
            # A new dict per row, so earlier entries are not overwritten
            item = {"price": "", "price per gramm": ""}
            item["price"] = str(item_text.text)
            print(item["price"])
            list_of_sku.append(item)
            print(list_of_sku)
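
The reason every entry showed the same price can be reproduced without any HTML. The following is a minimal sketch, assuming a hypothetical list of price strings in place of the parsed table cells. The function names `collect_shared` and `collect_fresh` are illustrative only and do not appear in the original code.

```python
def collect_shared(prices):
    """Problem pattern: one dict is created once and appended repeatedly."""
    result = []
    item = {"price": "", "price per gramm": ""}  # created outside the loop
    for price in prices:
        item["price"] = price   # mutates the single shared dict
        result.append(item)     # appends another reference to that same dict
    return result


def collect_fresh(prices):
    """Fixed pattern: a new dict is created on every iteration."""
    result = []
    for price in prices:
        item = {"price": price, "price per gramm": ""}  # new object each time
        result.append(item)
    return result


# Hypothetical sample values standing in for the scraped <td> text
sample = ["$125.70", "$35.70"]

shared = collect_shared(sample)
fresh = collect_fresh(sample)

print(shared)                  # every entry shows the last price
print(fresh)                   # each entry keeps its own price
print(shared[0] is shared[1])  # the list holds the same object twice
```

If the dict has to be defined once as a template, appending `item.copy()` instead of `item` should have the same effect for flat dicts like this one.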