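My data has a column named Pizza Shops that holds a numeric count for each state, ranging from about 10,000 to over a million. For some reason, every bubble looks right in size but is drawn in the same color (red).
My code: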
import plotly.graph_objects as go
import pandas as pd
import os
xl_path = "path to XLSX file"
df = pd.read_excel(open(xl_path, 'rb'), sheet_name='Data')
df.head()
scale = 5000
limits = [(0,15000),(15000,50000),(50000,100000),(100000,500000),(500000,2000000)]
colors = ["red","orange","yellow","green","blue"]
df['Text'] = df['State'] + '<br>Number of Pizza Shops ' + (df['Pizza Shops']).astype(str)
fig = go.Figure()
for i in range(len(limits)):
    lim = limits[i]
    df_sub = df[lim[0]:lim[1]]
    fig.add_trace(go.Scattergeo(
        locationmode = 'USA-states',
        locations=df['State Code'],
        text = df_sub['Text'],
        marker = dict(
            size = df_sub['Pizza Shops']/scale,
            color = colors[i],
            line_color='rgb(40,40,40)',
            line_width=0.5,
            sizemode = 'area'
        ),
        name = '{0} - {1}'.format(lim[0],lim[1])))
fig.update_layout(
    title_text = '2019 US Number of Pizza Shops<br>(Click legend to toggle traces)',
    showlegend = True,
    geo = dict(
        scope = 'usa',
        landcolor = 'rgb(217, 217, 217)',
    )
)
fig.show()
Sample data:
| State      | State Code | Pizza Shops |
------------------------------------------
  Texas        TX           13256
  California   CA           500235
  Idaho        ID           4000
  ....         ....         ....          and so on
Answer 0 (score: 0)
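The problem is that df_sub = df[lim[0]:lim[1]] subsets the data frame by row position, not by the number of shops. If your data frame has fewer than 15,000 rows, every data point falls into the first bucket and is drawn in red.
To subset the data frame by the number of shops instead, replace df_sub = df[lim[0]:lim[1]] with df_sub = df[(df["Pizza Shops"] >= lim[0]) & (df["Pizza Shops"] < lim[1])]. It also helps to pass locations=df_sub["State Code"] rather than locations=df["State Code"], so that the locations match the rows of each subset.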
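To make the difference concrete, here is a minimal sketch that compares the two ways of subsetting. It assumes a hypothetical three-row frame built from the sample rows in the question; the names by_position and by_value are illustrative only.

```python
import pandas as pd

# Hypothetical three-row frame mirroring the sample data in the question.
df = pd.DataFrame({
    "State Code": ["TX", "CA", "ID"],
    "Pizza Shops": [13256, 500235, 4000],
})

lo, hi = 0, 15000

# Plain slicing selects rows by position, so a frame with fewer than
# 15,000 rows comes back whole, whatever the shop counts are.
by_position = df[lo:hi]

# A boolean mask compares the column values against the bucket bounds.
by_value = df[(df["Pizza Shops"] >= lo) & (df["Pizza Shops"] < hi)]

print(len(by_position))                 # 3
print(len(df[15000:50000]))             # 0 (no rows at those positions)
print(by_value["State Code"].tolist())  # ['TX', 'ID']
```

With positional slicing, the first bucket receives every row and the remaining buckets are empty, which is why only the red trace shows up on the map.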
import plotly.graph_objects as go
import pandas as pd
df = pd.DataFrame({"State": ["Texas", "California", "Idaho", "Alabama", "Arizona", "Georgia", "Washington"],
                   "State Code": ["TX", "CA", "ID", "AL", "AZ", "GA", "WA"],
                   "Pizza Shops": [12500, 25000, 75000, 250000, 1000000, 15000, 100000]})
df["Text"] = df["State"] + "<br>Number of Pizza Shops " + (df["Pizza Shops"]).astype(str)
scale = 2000
limits = [(0,15000),(15000,50000),(50000,100000),(100000,500000),(500000,2000000)]
colors = ["red", "orange", "yellow", "green", "blue"]
fig = go.Figure()
for i in range(len(limits)):
    lim = limits[i]
    df_sub = df[(df["Pizza Shops"] >= lim[0]) & (df["Pizza Shops"] < lim[1])]
    fig.add_trace(go.Scattergeo(
        locationmode="USA-states",
        locations=df_sub["State Code"],
        text=df_sub["Text"],
        marker=dict(
            size=df_sub["Pizza Shops"]/scale,
            color=colors[i],
            line_color="rgb(40,40,40)",
            line_width=0.5,
            sizemode="area"),
        name="{0} - {1}".format(lim[0],lim[1])))
fig.update_layout(
    title_text="2019 US Number of Pizza Shops<br>(Click legend to toggle traces)",
    showlegend=True,
    geo=dict(scope="usa", landcolor="rgb(217, 217, 217)")
)
fig.show()
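As a possible alternative to writing the comparison by hand, pandas.cut can assign each row to a bucket in one step. This is only a sketch of a different technique, not what the answer above uses; it reuses the answer's example values, and the edges list and the Color column are names introduced here for illustration.

```python
import pandas as pd

# Same example values as in the answer above.
df = pd.DataFrame({
    "State Code": ["TX", "CA", "ID", "AL", "AZ", "GA", "WA"],
    "Pizza Shops": [12500, 25000, 75000, 250000, 1000000, 15000, 100000],
})

# Bucket boundaries taken from the limits list, flattened into bin edges.
edges = [0, 15000, 50000, 100000, 500000, 2000000]
colors = ["red", "orange", "yellow", "green", "blue"]

# right=False makes each bin [low, high), matching the >= / < comparison.
df["Color"] = pd.cut(df["Pizza Shops"], bins=edges, labels=colors, right=False)

print(df["Color"].astype(str).tolist())
# ['red', 'orange', 'yellow', 'green', 'blue', 'orange', 'green']
```

Each trace could then be built from the rows that share a Color value, for example by filtering on df["Color"] == colors[i] inside the existing loop.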