Pandas - row count in a new column and group by

Time: 2018-11-27 18:37:52

Tags: python pandas

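I have this data set, and I want to display the columns ("Police District Name", "Number of Crimes") for all crimes with more than 3 victims. However, the "Number of Crimes" column does not exist yet and has to be created; it should hold the total number of crimes committed in that district. Note: each row represents 1 crime.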

Sample of the data set:

   Incident ID  Victims Police District Name Beat
0    201087096        1           GERMANTOWN  5N1
1    201087097        1              WHEATON  4K2
2    201087097        1              WHEATON  4K2
3    201087097        1              WHEATON  4K2
4    201087100        1           GERMANTOWN  5M1

Here is my code:

import pandas as pd

crimes_df = pd.read_csv('data/Crime.csv', low_memory=False, dtype={'Incident ID': int, 'Beat':object})
more_than_three_victims = crimes_df[(crimes_df['Victims'] > 3)]
more_than_three_victims.groupby(['Police District Name']).sum()

I don't know where to go from here; any help would be appreciated.

1 Answer:

Answer 0: (score: 1)

First, when you read in the data, you don't have to build a df from all of the columns:

crimes_df = pd.read_csv('./Desktop/Crime.csv', usecols=['Police District Name', 'Victims'])
# The above will only read in the columns listed
more_than_three_victims = crimes_df[(crimes_df['Victims'] > 3)] # keep only crimes with more than 3 victims
groupby_victims = more_than_three_victims.groupby('Police District Name')['Victims'].agg(['sum']).rename(columns = {'sum': 'Number of Victims'})
print(groupby_victims)

Output:

                      Number of Victims
Police District Name                  
BETHESDA                            52
GERMANTOWN                         106
MONTGOMERY VILLAGE                 104
ROCKVILLE                           73
SILVER SPRING                      107
TAKOMA PARK                          4
WHEATON                             78

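This groups by "Police District Name", sums the number of victims in each district (counting only crimes with more than 3 victims), and then renames the "sum" column to "Number of Victims". I believe this is what you want.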
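To try the read, filter, and sum steps without the full file, here is a minimal sketch that reads a small in-memory CSV. The sample rows and their values are made up for illustration and are not taken from the real data set; only the column names come from the question.

```python
import io
import pandas as pd

# Hypothetical sample rows; only the column names match the question's data set.
csv_text = """Incident ID,Victims,Police District Name,Beat
1,1,GERMANTOWN,5N1
2,4,WHEATON,4K2
3,5,WHEATON,4K2
4,6,GERMANTOWN,5M1
5,2,WHEATON,4K2
"""

# usecols limits the read to the two columns the aggregation needs.
sample_df = pd.read_csv(
    io.StringIO(csv_text),
    usecols=['Police District Name', 'Victims'],
)

# Keep only crimes with more than 3 victims.
filtered = sample_df[sample_df['Victims'] > 3]

# Total victims per district among the filtered crimes.
victims_per_district = (
    filtered.groupby('Police District Name')['Victims']
    .sum()
    .rename('Number of Victims')
)
print(victims_per_district)
```

Calling .sum() directly on the selected column returns a Series, so no rename of a "sum" column is needed; this is just one possible way to write the same aggregation.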

If you want to count the number of crimes with more than 3 victims instead:

groupby_victims = more_than_three_victims.groupby('Police District Name')['Victims'].agg(['count']).rename(columns ={'count': 'Number of Crimes'})
# you just change 'sum' to 'count'

Output:

                      Number of Crimes
Police District Name                  
BETHESDA                             9
GERMANTOWN                          23
MONTGOMERY VILLAGE                  21
ROCKVILLE                           15
SILVER SPRING                       21
TAKOMA PARK                          1
WHEATON                             18

Again, this is the number of crimes, not the total number of victims.
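Since the question asks for two plain columns ("Police District Name", "Number of Crimes"), one possible variation is to count rows with size() and move the group labels back into a regular column. This is a sketch under assumptions: the sample DataFrame below is hypothetical and only mirrors the column names from the question.

```python
import pandas as pd

# Hypothetical sample; each row is one crime, as in the question.
crimes = pd.DataFrame({
    'Police District Name': ['WHEATON', 'WHEATON', 'GERMANTOWN', 'GERMANTOWN', 'WHEATON'],
    'Victims': [4, 5, 6, 1, 2],
})

# Keep only crimes with more than 3 victims.
more_than_three = crimes[crimes['Victims'] > 3]

# size() counts the rows in each group; reset_index(name=...) turns the
# resulting Series into a DataFrame with two ordinary columns.
crime_counts = (
    more_than_three.groupby('Police District Name')
    .size()
    .reset_index(name='Number of Crimes')
)
print(crime_counts)
```

Note that size() counts every row in a group, while count() skips rows where the counted column is missing; for a "Victims" column without missing values the two give the same numbers.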