I have a dataset with the following structure:
zip code |type of crime
------ |------
1002 |crime1
1002 |crime1
1002 |crime2
1002 |crime1
9210 |crime1
9210 |crime1
9210 |crime2
9210 |crime2
I also have a list of the minimum sentence for each crime:
crime | minimum sentence (days)
------| ------
crime1|10
crime2|15
Using these two tables, I would like to do the following:
Count the total number of each crime in each neighborhood:
zip code | crime |number of crimes
------ | ------ |-----
1002 | crime1 | 3
1002 | crime2 | 1
9210 | crime1 | 2
9210 | crime2 | 2
Multiply each crime count by its minimum sentence, then compute the total number of days for the neighborhood:
zip | crime | crimexdays
---- | ------ | -----
1002 | crime1 | 30
1002 | crime2 | 15
9210 | crime1 | 20
9210 | crime2 | 30
I would really appreciate any help here. Cheers!!
Answer 0: (score: 3)
Use count to get the frequencies, left_join to bring in the second dataset, and transmute to create the new column:
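library(dplyr)

df1 %>%
  # number of rows per zip code and crime type, stored in column n
  count(zipcode, typeofcrime) %>%
  # attach the minimum sentence for each crime
  left_join(df2, by = c("typeofcrime" = "crime")) %>%
  # list zipcode explicitly so it is kept in the output
  transmute(zipcode, typeofcrime, crimexsentence = n * minimumsentence)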
# zipcode typeofcrime crimexsentence
# <int> <chr> <int>
#1 1002 crime1 30
#2 1002 crime2 15
#3 9210 crime1 20
#4 9210 crime2 30
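The question also asks for the total days per neighborhood, which the pipeline above stops short of. A minimal base R sketch of the whole calculation follows, using no extra packages. The data frames df1 and df2 are reconstructed from the question's tables, and the column names zipcode, typeofcrime, crime and minimumsentence are assumed to match the ones used in the answer; the names counts, res and totals are only illustrative.

```r
# Sample data rebuilt from the question (column names are assumptions)
df1 <- data.frame(
  zipcode = c(1002L, 1002L, 1002L, 1002L, 9210L, 9210L, 9210L, 9210L),
  typeofcrime = c("crime1", "crime1", "crime2", "crime1",
                  "crime1", "crime1", "crime2", "crime2"),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  crime = c("crime1", "crime2"),
  minimumsentence = c(10L, 15L),
  stringsAsFactors = FALSE
)

# Step 1: count rows per (zipcode, typeofcrime) by summing a column of ones
counts <- aggregate(n ~ zipcode + typeofcrime,
                    data = transform(df1, n = 1L), FUN = sum)

# Step 2: attach the minimum sentence and multiply it by the count
res <- merge(counts, df2, by.x = "typeofcrime", by.y = "crime", all.x = TRUE)
res$crimexsentence <- res$n * res$minimumsentence
res <- res[order(res$zipcode, res$typeofcrime),
           c("zipcode", "typeofcrime", "crimexsentence")]

# Step 3: total days per zip code
totals <- aggregate(crimexsentence ~ zipcode, data = res, FUN = sum)

print(res)
print(totals)
```

With the sample data, res holds the same four rows as the output above, and totals has one row per zip code (45 days for 1002 and 50 days for 9210).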