I'm working with pandas and used groupby:
group = df_crimes_query.groupby(["CrimeDateTime", "WeaponFactor"]).size()
group.head(20)
CrimeDateTime  WeaponFactor
2016-01-01     FIREARM          11
               HANDS            26
               KNIFE             3
               OTHER            11
               UNDEFINED       102
2016-01-02     FIREARM          10
               HANDS            21
               KNIFE             8
               OTHER             6
               UNDEFINED        68
2016-01-03     FIREARM          12
               HANDS            13
               KNIFE             6
               OTHER             5
               UNDEFINED        73
2016-01-04     FIREARM          11
               HANDS            10
               KNIFE             1
               OTHER             3
               UNDEFINED        84
dtype: int64
Its type is Series:
type(group)
pandas.core.series.Series
I'd like a DataFrame like this:
CrimeDateTime FIREARM HANDS KNIFE OTHER UNDEFINED
2016-01-01 11 26 3 11 102
2016-01-02 10 21 8 6 68
2016-01-03 12 13 6 5 73
2016-01-04 11 10 1 3 84
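I want to use this DataFrame to plot five time series, one per weapon type (FIREARM, HANDS, etc.). I have tried and searched online, but without success.
The code is on my GitHub (in the section named Testing): https://github.com/rmmariano/CAP386_intro_data_science/blob/master/projeto/crimes_baltimore/crimes_baltimore.ipynb
I had other test code, but I removed it to keep the notebook clean.
Does anyone have any ideas?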
Answer 0 (score: 2)
Option 1
Simple and slow
pd.crosstab(df.CrimeDateTime, df.WeaponFactor)
WeaponFactor FIREARM HANDS KNIFE OTHER UNDEFINED
CrimeDateTime
2016-01-01 11 26 3 11 102
2016-01-02 10 21 8 6 68
2016-01-03 12 13 6 5 73
2016-01-04 11 10 1 3 84
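Option 2
Faster and cooler!
# dtype=int: recent pandas versions return boolean dummies by default, which would not sum to counts
pd.get_dummies(df.CrimeDateTime, dtype=int).T.dot(pd.get_dummies(df.WeaponFactor, dtype=int))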
FIREARM HANDS KNIFE OTHER UNDEFINED
2016-01-01 11 26 3 11 102
2016-01-02 10 21 8 6 68
2016-01-03 12 13 6 5 73
2016-01-04 11 10 1 3 84
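Option 3
Next-level kung fu pandas!
import numpy as np
import pandas as pd

# integer codes (i, j) and unique labels (r, c) for the rows and columns
i, r = pd.factorize(df.CrimeDateTime.values)
j, c = pd.factorize(df.WeaponFactor.values)
n, m = r.size, c.size
# count each flattened (row, column) code, then reshape into an n x m table
b = np.bincount(j + i * m, minlength=n * m).reshape(n, m)
pd.DataFrame(b, r, c)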
FIREARM HANDS KNIFE OTHER UNDEFINED
2016-01-01 11 26 3 11 102
2016-01-02 10 21 8 6 68
2016-01-03 12 13 6 5 73
2016-01-04 11 10 1 3 84
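As a rough check that the three options agree, here is a minimal sketch on a small made-up frame. The sample rows and the names `opt1`, `opt2`, `opt3`, and `same` are assumptions for illustration, not the Baltimore data. Note that Option 3 orders labels by first appearance rather than sorting them, so its columns are realigned before comparing.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data (not the real dataset): one row per incident.
df = pd.DataFrame({
    "CrimeDateTime": ["2016-01-01", "2016-01-01", "2016-01-01",
                      "2016-01-02", "2016-01-02"],
    "WeaponFactor": ["KNIFE", "HANDS", "KNIFE", "HANDS", "FIREARM"],
})

# Option 1: crosstab counts each (date, weapon) pair.
opt1 = pd.crosstab(df.CrimeDateTime, df.WeaponFactor)

# Option 2: product of indicator matrices; dtype=int keeps the result as counts.
opt2 = pd.get_dummies(df.CrimeDateTime, dtype=int).T.dot(
    pd.get_dummies(df.WeaponFactor, dtype=int)
)

# Option 3: factorize both columns and count the flattened (row, column) codes.
i, r = pd.factorize(df.CrimeDateTime.values)
j, c = pd.factorize(df.WeaponFactor.values)
n, m = r.size, c.size
opt3 = pd.DataFrame(np.bincount(j + i * m, minlength=n * m).reshape(n, m), r, c)

# Option 3 keeps first-appearance order, so align its columns before comparing.
opt3 = opt3[opt1.columns]
same = bool((opt1.values == opt2.values).all() and (opt1.values == opt3.values).all())
print(same)
```

If the dates in the real data are not already in chronological order, Option 3 would likely also need a `sort_index()` before plotting.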
Answer 1 (score: 1)
You can get the desired result with:
df_crimes_query.groupby(["CrimeDateTime", "WeaponFactor"]).size().unstack().reset_index()
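One thing to watch with `unstack()`: if some date has no incidents for a given weapon type, that cell comes out as NaN. A minimal sketch of this, assuming a small made-up frame (the sample rows and the names `with_nan` and `wide` are illustrative only):

```python
import pandas as pd

# Hypothetical sample data: there is no KNIFE incident on 2016-01-02.
df_crimes_query = pd.DataFrame({
    "CrimeDateTime": ["2016-01-01", "2016-01-01", "2016-01-02"],
    "WeaponFactor": ["KNIFE", "HANDS", "HANDS"],
})

counts = df_crimes_query.groupby(["CrimeDateTime", "WeaponFactor"]).size()

# Without fill_value, the missing (date, weapon) pair becomes NaN.
with_nan = counts.unstack()

# fill_value=0 keeps integer counts. Leaving CrimeDateTime as the index
# (no reset_index) makes the dates the x-axis when plotting.
wide = counts.unstack(fill_value=0)
print(wide)
```

With matplotlib installed, `wide.plot()` should then draw one line per weapon type, which is what the question asks for.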
Answer 2 (score: 1)
You can use a pivot table instead of groupby, i.e.
df.pivot_table(index='CrimeDateTime',columns='WeaponFactor',values='count')
if you have a DataFrame like this, based on the code in the notebook:
   CrimeDateTime WeaponFactor  count
0     2016-01-01      FIREARM     11
1     2016-01-01        HANDS     26
2     2016-01-01        KNIFE      3
3     2016-01-01        OTHER     11
4     2016-01-01    UNDEFINED    102
5     2016-01-02      FIREARM     10
6     2016-01-02        HANDS     21
7     2016-01-02        KNIFE      8
8     2016-01-02        OTHER      6
9     2016-01-02    UNDEFINED     68
10    2016-01-03      FIREARM     12
11    2016-01-03        HANDS     13
12    2016-01-03        KNIFE      6
13    2016-01-03        OTHER      5
14    2016-01-03    UNDEFINED     73
15    2016-01-04      FIREARM     11
16    2016-01-04        HANDS     10
17    2016-01-04        KNIFE      1
18    2016-01-04        OTHER      3
19    2016-01-04    UNDEFINED     84
Output:
df.pivot_table(index='CrimeDateTime',columns='WeaponFactor',values='count')
WeaponFactor   FIREARM  HANDS  KNIFE  OTHER  UNDEFINED
CrimeDateTime
2016-01-01          11     26      3     11        102
2016-01-02          10     21      8      6         68
2016-01-03          12     13      6      5         73
2016-01-04          11     10      1      3         84
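The answer assumes a long-format frame with a `count` column already exists. One possible way to build it from the raw incident rows is sketched below; the sample data is made up, and passing `aggfunc="sum"` and `fill_value=0` is my own assumption rather than part of the answer (`pivot_table` averages by default, which gives the same numbers only when each date and weapon pair appears once).

```python
import pandas as pd

# Hypothetical sample data: one row per incident.
df_crimes_query = pd.DataFrame({
    "CrimeDateTime": ["2016-01-01", "2016-01-01", "2016-01-01", "2016-01-02"],
    "WeaponFactor": ["KNIFE", "HANDS", "KNIFE", "HANDS"],
})

# Long table: one row per (date, weapon) pair with a `count` column.
df = (
    df_crimes_query.groupby(["CrimeDateTime", "WeaponFactor"])
    .size()
    .reset_index(name="count")
)

# aggfunc="sum" is stated explicitly (the default is the mean);
# fill_value=0 fills (date, weapon) pairs that never occur.
wide = df.pivot_table(
    index="CrimeDateTime", columns="WeaponFactor",
    values="count", aggfunc="sum", fill_value=0,
)
print(wide)
```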