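I have the columns "household", "person" (each person within a household), "tour" (each tour contains one or more trips made by a person), "trip" (the trip number within each tour), and "mode" (the travel mode each person used for each trip).
I want to change the mode column for each tour as follows:
mode == car, if at least one trip in the tour uses the car mode
mode == non-car, if none of the trips in the tour has mode == car
Example: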
household. person. trip. tour. mode
1 1 1 1 car
1 1 2 1 walk
1 1 4 1 bus
1 1 1 2 bus
1 1 2 2 walk
1 2 1 1 walk
1 2 2 1 bus
1 2 3 1 walk
2 1 1 1 walk
2 1 1 1 car
Output:
household. person. trip. tour. mode
1 1 1 1 car
1 1 2 1 car
1 1 4 1 car
1 1 1 2 non-car
1 1 2 2 non-car
1 2 1 1 non-car
1 2 2 1 non-car
1 2 3 1 non-car
2 1 1 1 car
2 1 1 1 car
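Answer 0 (score: 4)
We can group by 'household.', 'person.', and 'tour.', then set 'mode' to one of two values by checking whether any value in the group equals 'car'. Here the logical result is turned into a numeric index by adding 1 (TRUE -> 2, FALSE -> 1), and that index selects the replacement from a vector of strings.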
library(dplyr)
df1 %>%
group_by(household., person., tour.) %>%
mutate(mode = c('non-car', 'car')[1+any(mode == "car")])
# A tibble: 10 x 5
# Groups: household., person., tour. [4]
# household. person. trip. tour. mode
# <int> <int> <int> <int> <chr>
# 1 1 1 1 1 car
# 2 1 1 2 1 car
# 3 1 1 4 1 car
# 4 1 1 1 2 non-car
# 5 1 1 2 2 non-car
# 6 1 2 1 1 non-car
# 7 1 2 2 1 non-car
# 8 1 2 3 1 non-car
# 9 2 1 1 1 car
#10 2 1 1 1 car
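The indexing step can be looked at on its own. This is a minimal sketch in base R; the names labels, has_car, and no_car are made up for illustration and do not appear in the answer.

```r
# A logical is coerced to 0/1 in arithmetic, so 1 + flag is a valid 1-based index.
labels <- c("non-car", "car")

has_car <- any(c("car", "walk", "bus") == "car")  # at least one car trip
no_car  <- any(c("bus", "walk") == "car")         # no car trip

picked_with_car    <- labels[1 + has_car]  # index 2
picked_without_car <- labels[1 + no_car]   # index 1

print(picked_with_car)
print(picked_without_car)
```

One caveat, assuming the real data may contain missing modes: if a group has an NA and no 'car', any() returns NA and the lookup yields NA as well. Passing na.rm = TRUE to any() avoids that if NA should count as "not car".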
# Input data used in the example above
df1 <- structure(list(household. = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
2L, 2L), person. = c(1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 1L, 1L),
trip. = c(1L, 2L, 4L, 1L, 2L, 1L, 2L, 3L, 1L, 1L), tour. = c(1L,
1L, 1L, 2L, 2L, 1L, 1L, 1L, 1L, 1L), mode = c("car", "walk",
"bus", "bus", "walk", "walk", "bus", "walk", "walk", "car"
)), class = "data.frame", row.names = c(NA, -10L))
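If dplyr is not available, the same grouping logic can probably be written in base R. The sketch below is an alternative, not the answer's method: it rebuilds the same ten rows with data.frame(), uses ave() to compute a per-group flag, and then relabels with ifelse(). The names has_car and expected are illustrative only.

```r
# Same ten rows as df1 above, rebuilt so the block runs on its own.
df1 <- data.frame(
  household. = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L),
  person.    = c(1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 1L, 1L),
  trip.      = c(1L, 2L, 4L, 1L, 2L, 1L, 2L, 3L, 1L, 1L),
  tour.      = c(1L, 1L, 1L, 2L, 2L, 1L, 1L, 1L, 1L, 1L),
  mode       = c("car", "walk", "bus", "bus", "walk",
                 "walk", "bus", "walk", "walk", "car"),
  stringsAsFactors = FALSE
)

# ave() applies FUN within each household/person/tour group and returns
# a vector aligned with the original rows.
has_car <- ave(df1$mode == "car",
               df1$household., df1$person., df1$tour.,
               FUN = any)

# Relabel every trip in a tour according to the group-level flag.
df1$mode <- ifelse(has_car, "car", "non-car")

print(df1)
```

Unlike the dplyr version, this returns a plain data.frame with no grouping attached, so no ungroup() step is needed afterwards.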