How can I change a column in a dataset based on whether its group contains a certain value?

Time: 2019-07-29 16:36:22

Tags: r dataframe

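I have the columns "household", "person" (each person in a household), "tour" (each tour contains the different trips made by a person), "trip" (the trip number within each tour), and "mode" (the travel mode used by the person on each trip).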

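I want to change the mode column tour by tour, as follows:

mode == car, if at least one trip in the tour uses the car mode

mode == non-car, if none of the trips in the tour has mode = car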

Example:

   household.  person.  trip.   tour.    mode
       1         1        1       1       car
       1         1        2       1       walk
       1         1        4       1       bus
       1         1        1       2       bus
       1         1        2       2       walk
       1         2        1       1       walk
       1         2        2       1       bus
       1         2        3       1       walk
       2         1        1       1       walk
       2         1        1       1       car

Output

   household.  person.  trip.   tour.    mode
       1         1        1       1       car
       1         1        2       1       car
       1         1        4       1       car
       1         1        1       2       non-car
       1         1        2       2       non-car
       1         2        1       1       non-car
       1         2        2       1       non-car
       1         2        3       1       non-car
       2         1        1       1       car
       2         1        1       1       car

1 answer:

Answer 0: (score: 4)

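We can group by 'household.', 'person.', and 'tour.', and then set 'mode' to one of two values by checking whether there is any 'car' in the column within each group. The logical result of any() is converted to a numeric index by adding 1 (TRUE -> 2, FALSE -> 1), and that index is used to pick the replacement from a vector of strings.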

library(dplyr)
df1 %>% 
    group_by(household., person., tour.) %>%
    mutate(mode = c('non-car', 'car')[1+any(mode == "car")])
# A tibble: 10 x 5
# Groups:   household., person., tour. [4]
#   household. person. trip. tour. mode   
#        <int>   <int> <int> <int> <chr>  
# 1          1       1     1     1 car    
# 2          1       1     2     1 car    
# 3          1       1     4     1 car    
# 4          1       1     1     2 non-car
# 5          1       1     2     2 non-car
# 6          1       2     1     1 non-car
# 7          1       2     2     1 non-car
# 8          1       2     3     1 non-car
# 9          2       1     1     1 car    
#10          2       1     1     1 car    
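
The indexing trick used inside mutate() can also be looked at in isolation. The following is a minimal base R sketch; the names `labels` and `label_tour` are hypothetical helpers introduced here for illustration and are not part of the answer above.

```r
# Hypothetical helper: label one tour's vector of modes as "car" or "non-car".
labels <- c("non-car", "car")

label_tour <- function(modes) {
  # any() collapses the element-wise comparison to a single TRUE/FALSE.
  has_car <- any(modes == "car")
  # In arithmetic, FALSE is 0 and TRUE is 1, so 1 + has_car is 1 or 2,
  # which selects "non-car" or "car" from `labels`.
  labels[1 + has_car]
}

with_car    <- label_tour(c("car", "walk", "bus"))
without_car <- label_tour(c("bus", "walk"))

# A more explicit spelling of the same choice, for comparison.
explicit <- if (any(c("bus", "walk") == "car")) "car" else "non-car"
```

Inside a grouped mutate(), the single value returned for each group is recycled to every row of that group, which is why all trips of a tour end up with the same label.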

Data

df1 <- structure(list(household. = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
2L, 2L), person. = c(1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 1L, 1L), 
    trip. = c(1L, 2L, 4L, 1L, 2L, 1L, 2L, 3L, 1L, 1L), tour. = c(1L, 
    1L, 1L, 2L, 2L, 1L, 1L, 1L, 1L, 1L), mode = c("car", "walk", 
    "bus", "bus", "walk", "walk", "bus", "walk", "walk", "car"
    )), class = "data.frame", row.names = c(NA, -10L))
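
If dplyr is not available, the same grouping logic can probably be expressed with ave() from base R. This is only a sketch under the assumption that the data look like df1 above; the data frame is rebuilt here with the same values so the block runs on its own, and `result` is a hypothetical name for the relabelled copy.

```r
# Same values as df1 above, rebuilt so that this block is self-contained.
df1 <- data.frame(
  household. = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L),
  person.    = c(1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 1L, 1L),
  trip.      = c(1L, 2L, 4L, 1L, 2L, 1L, 2L, 3L, 1L, 1L),
  tour.      = c(1L, 1L, 1L, 2L, 2L, 1L, 1L, 1L, 1L, 1L),
  mode       = c("car", "walk", "bus", "bus", "walk",
                 "walk", "bus", "walk", "walk", "car"),
  stringsAsFactors = FALSE
)

result <- df1
# ave() applies FUN to the 'mode' values of each household/person/tour
# combination and writes the results back in the original row order.
result$mode <- ave(
  df1$mode, df1$household., df1$person., df1$tour.,
  FUN = function(m) {
    # One label per row of the group: "car" if any trip used a car.
    rep(c("non-car", "car")[1 + any(m == "car")], length(m))
  }
)
```

The design choice is the same as in the dplyr version: decide once per tour, then repeat that decision for every trip in the tour.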