This is a very basic question, but I'm stuck.
I'm trying to make a chart like this:
I have the following DataFrame:
   duration  start_date  start_year  start_month  start_hour  weekday     start_city       end_city subscription_type
0  1.050000  2013-08-29        2013            8          14        3  San Francisco  San Francisco        Subscriber
1  1.166667  2013-08-29        2013            8          14        3       San Jose       San Jose        Subscriber
2  1.183333  2013-08-29        2013            8          10        3  Mountain View  Mountain View        Subscriber
3  1.283333  2013-08-29        2013            8          11        3       San Jose       San Jose        Subscriber
4  1.383333  2013-08-29        2013            8          12        3  San Francisco  San Francisco        Subscriber
I'm using this code:
viagem_por_tipo = trip_data.groupby(['subscription_type'],['weekday'])['start_year'].count()
viagem_por_tipo.plot.bar()
and I get this error:
TypeError Traceback (most recent call last)
<ipython-input-40-4b3318e38ba7> in <module>()
----> 1 viagem_por_tipo = trip_data.groupby(['subscription_type'],['weekday'])['start_year'].count()
2 viagem_por_tipo.plot.bar()
C:\Users\Michel Spiero\Anaconda3\lib\site-packages\pandas\core\generic.py in groupby(self, by, axis, level, as_index, sort, group_keys, squeeze, **kwargs)
4266 if level is None and by is None:
4267 raise TypeError("You have to supply one of 'by' and 'level'")
-> 4268 axis = self._get_axis_number(axis)
4269 return groupby(self, by=by, axis=axis, level=level, as_index=as_index,
4270 sort=sort, group_keys=group_keys, squeeze=squeeze,
C:\Users\Michel Spiero\Anaconda3\lib\site-packages\pandas\core\generic.py in _get_axis_number(self, axis)
339
340 def _get_axis_number(self, axis):
--> 341 axis = self._AXIS_ALIASES.get(axis, axis)
342 if is_integer(axis):
343 if axis in self._AXIS_NAMES:
TypeError: unhashable type: 'list'
Can anyone help me?
Thanks in advance.
Answer 0 (score: 1)
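Note that groupby() expects all grouping columns in a single list: df.groupby(['subscription_type','weekday']).
In your call the second list, ['weekday'], is passed as the axis argument, which is what raises TypeError: unhashable type: 'list'.
After grouping, you need to pivot the grouped DataFrame so that each subscription type becomes its own column.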
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
df = pd.DataFrame({"subscription_type" : np.random.choice(["Subscriber","Customer"], size=240),
"weekday" : np.random.randint(1,8,size=240),
"start_year" : np.ones(240)*2013 })
# count rows per (subscription_type, weekday) pair
gr = df.groupby(['subscription_type','weekday'])['start_year'].size().reset_index(name="Count")
# one row per weekday, one column per subscription type
piv = pd.pivot_table(gr, values='Count', columns=['subscription_type'],
                     index="weekday", aggfunc="sum", fill_value=0)
piv.plot(kind="bar")
plt.show()
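As a possible shortcut, the group-then-pivot step above can also be written with size() followed by unstack(), which moves the subscription_type level of the grouped index into columns. This is only a minimal sketch on the same kind of random data; the seeded generator rng and the name counts are my own choices for illustration and are not part of the original code.

```python
import numpy as np
import pandas as pd

# Reproducible random sample data (the seed is an arbitrary choice)
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "subscription_type": rng.choice(["Subscriber", "Customer"], size=240),
    "weekday": rng.integers(1, 8, size=240),
})

# Count rows per (weekday, subscription_type) pair, then move the
# subscription_type level into columns; missing pairs become 0
counts = (
    df.groupby(["weekday", "subscription_type"])
      .size()
      .unstack(fill_value=0)
)
print(counts)
```

Calling counts.plot(kind="bar") followed by plt.show() should then draw the same kind of grouped bar chart as piv above.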
It may also be worth noting that a similar result can be obtained with seaborn's countplot,
which does the counting itself:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
df = pd.DataFrame({"subscription_type" : np.random.choice(["Subscriber","Customer"], size=240),
"weekday" : np.random.randint(1,8,size=240),
"start_year" : np.ones(240)*2013 })
sns.countplot(x="weekday", hue="subscription_type", data=df)
plt.show()