I have a CSV dataset that I imported with the pandas read_csv function. When I run .head(), I get the following table output:
LSOA code Crime type
0 E01006687 Burglary
1 E01007229 Anti-social behaviour
2 E01007229 Anti-social behaviour
3 E01007229 Anti-social behaviour
4 E01007229 Burglary
5 E01007229 Other theft
6 E01007229 Other theft
7 E01007229 Shoplifting
8 E01007229 Theft from the person
9 E01007230 Anti-social behaviour
10 E01007230 Anti-social behaviour
11 E01007230 Anti-social behaviour
12 E01007230 Anti-social behaviour
13 E01007230 Anti-social behaviour
14 E01007230 Anti-social behaviour
15 E01007230 Anti-social behaviour
16 E01007230 Anti-social behaviour
17 E01007230 Anti-social behaviour
18 E01007230 Anti-social behaviour
19 E01007230 Anti-social behaviour
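The table contains over 33,000 rows. What I need to do is get all the unique values of 'LSOA code' (there are 207 of them), then, for each 'LSOA code', a count of the occurrences of each 'Crime type' (there are about 30 of them), and finally the total number of crimes for each LSOA code.
For example, I would like an output table like the following, where 'LSOA code' is the index column: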
LSOA code | Burglary | Anti-social Behavior | Bicycle Theft | Assault ... | Total
E01000067 | 32 | 21 | 8 | 43 ... | 1023
E01000043 | 98 | 65 | 5 | 73 ... | 2308
E01000237 | 38 | 34 | 12 | 92 ... | 897
E01000038 | 82 | 28 | 3 | 18 ... | 2147
etc.
I managed to get the LSOA codes into a DataFrame, along with the total number of crimes in each LSOA, using the following:
WirralCrimes = Crimes['LSOA code'].value_counts()
CrimeDF = pd.DataFrame(pd.Series(WirralCrimes))
CrimeDF.columns = ["Count"]
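...but I can't figure out how to turn each crime type into its own column and count its occurrences for each LSOA.
Can anyone point me to what I should be doing?
Many thanks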
Answer 0 (score: 0)
If your data looks like this, the following should work:
import pandas as pd
your_data = pd.DataFrame({'LSOA code': ['E01006687','E01007229','E01007229','E01007229','E01007229','E01007229','E01007229','E01007229','E01007230','E01007230'],
                          'Crime type': ['Burglary','Anti-social behaviour','Anti-social behaviour','Anti-social behaviour','Burglary','Other theft','Other theft','Shoplifting','Theft from the person','Anti-social behaviour']})
# Helper column: each row counts as one crime
your_data['count'] = 1
# One row per LSOA code, one column per crime type; fill_value=0 replaces NaN for pairs that never occur
table = pd.pivot_table(your_data, index='LSOA code', columns='Crime type', values='count', aggfunc='sum', fill_value=0)
# Row-wise total across all crime types
table['total'] = table.sum(axis=1)
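As an alternative that avoids the helper 'count' column, pandas also provides pd.crosstab, which counts the occurrences of each pair of values directly. The following is only a minimal sketch: the five sample rows and the names crimes and "Total" are made up for illustration, and in practice crimes would be the DataFrame returned by read_csv.

```python
import pandas as pd

# Hypothetical sample rows shaped like the table in the question.
crimes = pd.DataFrame({
    "LSOA code": ["E01006687", "E01007229", "E01007229", "E01007229", "E01007230"],
    "Crime type": ["Burglary", "Anti-social behaviour", "Burglary",
                   "Other theft", "Anti-social behaviour"],
})

# crosstab counts each (LSOA code, Crime type) pair; pairs that never occur become 0.
# margins=True appends a totals row and a totals column, both named by margins_name.
table = pd.crosstab(crimes["LSOA code"], crimes["Crime type"],
                    margins=True, margins_name="Total")

# Drop the extra totals row so that only the per-LSOA rows remain.
table = table.drop(index="Total")
print(table)
```

The result has 'LSOA code' as the index, one column per crime type, and a final "Total" column, which matches the layout asked for in the question.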