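My data frame (df) has two variables, location and weather.
I would like a wide data frame (dfgoal) with one row per location and three new variables (weather_1 to weather_3) that hold the number of observations of each value of the original weather variable.
The problem is that when I try dplyr::mutate(), I only get TRUE/FALSE output instead of counts, or this error message: Evaluation error: no applicable method for 'summarise_' applied to an object of class "logical".
Any help is much appreciated.
Starting point (df):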
df <- data.frame(location=c("az","az","az","az","bi","bi","bi","ca","ca","ca","ca","ca"),weather=c(1,1,2,3,2,3,2,1,2,3,1,2))
Desired result (dfgoal):
dfgoal <- data.frame(location=c("az","bi","ca"),weather_1=c(2,0,2),weather_2=c(1,2,2),weather_3=c(1,1,1))
Current code:
library(dplyr)
df %>% group_by(location) %>% mutate(weather_1 = (weather == 1)) %>% mutate(weather_2 = (weather == 2)) %>% mutate(weather_3 = (weather == 3))
df %>% group_by(location) %>% mutate(weather_1 = summarise(weather == 1)) %>% mutate(weather_2 = summarise(weather == 2)) %>% mutate(weather_3 = summarise(weather == 3))
Answer 0 (score: 3)
This is very simple with the function table():
df %>% table
        weather
location 1 2 3
      az 2 1 1
      bi 0 2 1
      ca 2 2 1
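The table above is a contingency table rather than a data frame shaped like dfgoal. As a rough sketch, assuming only base R, it can be converted as follows. The intermediate names tab and wide are made up for this illustration and do not come from the answer.

```r
# Recreate the question's data so the sketch is self-contained
df <- data.frame(
  location = c("az","az","az","az","bi","bi","bi","ca","ca","ca","ca","ca"),
  weather  = c(1,1,2,3,2,3,2,1,2,3,1,2)
)

# Cross-tabulate location against weather
tab <- table(df)

# Turn the table into a data frame with one column per weather value
wide <- as.data.frame.matrix(tab)

# Prefix the column names so they read weather_1, weather_2, weather_3
names(wide) <- paste0("weather_", names(wide))

# Move the row names into a regular location column
wide <- cbind(location = rownames(wide), wide)
rownames(wide) <- NULL

print(wide)
```

The counts come out as integers, while dfgoal in the question stores them as doubles; for most purposes the difference should not matter.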
Answer 1 (score: 1)
Krzysztof's solution is the way to go, but if you insist on using the tidyverse, here is a solution with dplyr + tidyr:
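library(dplyr)
library(tidyr)
df %>%
  group_by(location, weather) %>%
  # n() counts the rows in each location/weather group
  summarize(count = n()) %>%
  # one column per weather value, named weather_1, weather_2, weather_3
  spread(weather, count, sep = "_") %>%
  # combinations that never occur come out as NA; replace them with 0
  mutate_all(~ coalesce(., 0L))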
Result:
# A tibble: 3 x 4
# Groups:   location [3]
  location weather_1 weather_2 weather_3
    <fctr>     <int>     <int>     <int>
1       az         2         1         1
2       bi         0         2         1
3       ca         2         2         1
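For comparison, here is a minimal sketch that stays closer to the attempt in the question. mutate() keeps every row, so weather == 1 yields one TRUE/FALSE per row; summing that logical vector inside summarise() gives one count per group instead. The result name counts is chosen only for this illustration.

```r
library(dplyr)

# Recreate the question's data so the sketch is self-contained
df <- data.frame(
  location = c("az","az","az","az","bi","bi","bi","ca","ca","ca","ca","ca"),
  weather  = c(1,1,2,3,2,3,2,1,2,3,1,2)
)

counts <- df %>%
  group_by(location) %>%
  summarise(
    # TRUE counts as 1 and FALSE as 0, so sum() gives the number of matches
    weather_1 = sum(weather == 1),
    weather_2 = sum(weather == 2),
    weather_3 = sum(weather == 3)
  )

print(counts)
```

This only works when the weather values are known in advance; the reshaping approaches in the answers handle an arbitrary set of values.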
Answer 2 (score: 0)
Krzysztof's answer is the simplest, but if you want a tidyverse-only solution (dplyr and tidyr):
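df %>%
  group_by(location, weather) %>%
  # weather == weather is TRUE for every row, so the sum is the group's row count
  summarize(bin = sum(weather == weather)) %>%
  # fill = 0 supplies zero for location/weather pairs that never occur
  spread(weather, bin, fill = 0, sep = '_')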
This results in:
location weather_1 weather_2 weather_3
az               2         1         1
bi               0         2         1
ca               2         2         1
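The same reshaping can probably be written with count() and pivot_wider(), which newer tidyr releases offer in place of spread(). This sketch assumes tidyr 1.0.0 or later is installed; the result name wide is made up for the illustration.

```r
library(dplyr)
library(tidyr)

# Recreate the question's data so the sketch is self-contained
df <- data.frame(
  location = c("az","az","az","az","bi","bi","bi","ca","ca","ca","ca","ca"),
  weather  = c(1,1,2,3,2,3,2,1,2,3,1,2)
)

wide <- df %>%
  # count() groups by both columns and stores the row count in a column named n
  count(location, weather) %>%
  pivot_wider(
    names_from   = weather,     # one new column per weather value
    values_from  = n,           # filled with the counts
    names_prefix = "weather_",  # weather_1, weather_2, weather_3
    values_fill  = 0L           # pairs that never occur become 0
  )

print(wide)
```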