Carrying over a value from another column in ifelse inside dplyr::mutate

Time: 2018-11-30 15:17:51

Tags: r if-statement dplyr mutate

I have a data frame dd (dput at the bottom of the question):

# A tibble: 6 x 2
# Groups:   Date [5]
  Date     keeper
  <chr>    <lgl> 
1 1/1/2018 TRUE  
2 2/1/2018 TRUE  
3 3/1/2018 FALSE 
4 4/1/2018 FALSE 
5 3/1/2018 TRUE  
6 5/1/2018 TRUE 

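Note that it is already grouped by Date. I'm trying to create another column that is TRUE if the group has only one row, and otherwise keeps the value of keeper. This seems simple, but when I tried it I got the following result: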

dd %>% mutate(moose=ifelse(n()==1,TRUE,keeper))
# A tibble: 6 x 3
# Groups:   Date [5]
  Date     keeper moose
  <chr>    <lgl>  <lgl>
1 1/1/2018 TRUE   TRUE 
2 2/1/2018 TRUE   TRUE 
3 3/1/2018 FALSE  FALSE
4 4/1/2018 FALSE  TRUE 
5 3/1/2018 TRUE   FALSE
6 5/1/2018 TRUE   TRUE 

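Note that rows 3 and 5 share the same date, so the new column should simply have kept whatever was in keeper for them, but both came out FALSE. What am I missing?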

Expected output:

  Date     keeper moose
  <chr>    <lgl>  <lgl>
1 1/1/2018 TRUE   TRUE 
2 2/1/2018 TRUE   TRUE 
3 3/1/2018 FALSE  FALSE
4 4/1/2018 FALSE  TRUE 
5 3/1/2018 TRUE   TRUE
6 5/1/2018 TRUE   TRUE 

(Note row 5.)

Here is the dput of the data frame:

dd<-structure(list(Date = c("1/1/2018", "2/1/2018", "3/1/2018", "4/1/2018", 
"3/1/2018", "5/1/2018"), keeper = c(TRUE, TRUE, FALSE, FALSE, 
TRUE, TRUE)), class = c("grouped_df", "tbl_df", "tbl", "data.frame"
), row.names = c(NA, -6L), vars = "Date", drop = TRUE, indices = list(
    0L, 1L, c(2L, 4L), 3L, 5L), group_sizes = c(1L, 1L, 2L, 1L, 
1L), biggest_group_size = 2L, labels = structure(list(Date = c("1/1/2018", 
"2/1/2018", "3/1/2018", "4/1/2018", "5/1/2018")), class = "data.frame", row.names = c(NA, 
-5L), vars = "Date", drop = TRUE, indices = list(0L, 1L, 2L, 
    4L, 3L, 5L), group_sizes = c(1L, 1L, 1L, 1L, 1L, 1L), biggest_group_size = 1L, labels = structure(list(
    Date = c("1/1/2018", "2/1/2018", "3/1/2018", "3/1/2018", 
    "4/1/2018", "5/1/2018"), keeper = c(TRUE, TRUE, FALSE, TRUE, 
    FALSE, TRUE)), class = "data.frame", row.names = c(NA, -6L
), vars = c("Date", "keeper"), drop = TRUE, .Names = c("Date", 
"keeper")), .Names = "Date"), .Names = c("Date", "keeper"))

Addendum:

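While continuing to work with this data frame, I found that if I first create a column n with add_count and then refer to that column in my ifelse instead of n(), I get the result I want. What causes this? Why doesn't n() give me the same result?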
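The workaround described in this addendum can be sketched as follows. This is a minimal sketch, assuming a current dplyr installation. The data frame is rebuilt with tibble() and group_by() instead of the dput above, and the name res_count is illustrative only.

```r
library(dplyr)

# Rebuild the example data with the same values as the dput in the question.
dd <- tibble(
  Date   = c("1/1/2018", "2/1/2018", "3/1/2018", "4/1/2018", "3/1/2018", "5/1/2018"),
  keeper = c(TRUE, TRUE, FALSE, FALSE, TRUE, TRUE)
) %>%
  group_by(Date)

# add_count() stores the group size in a real column `n`, one value per row,
# so the ifelse() test has the same length as `keeper` within each group.
res_count <- dd %>%
  add_count(Date) %>%
  mutate(moose = ifelse(n == 1, TRUE, keeper))

res_count$moose
```

Here row 5 keeps its keeper value because the test n == 1 is evaluated once per row, not once per group.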

1 Answer:

Answer 0 (score: 1)

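There is a recycling effect. ifelse needs all of its arguments to have the same length. The length of n() is 1, and the second argument, TRUE, also has length 1. The mismatch is with the third argument, keeper, whose length is the group size. ifelse returns a result shaped like its test, so within each group only the first element of keeper is used, and that single value is then recycled across the group's rows. That is why rows 3 and 5 both came out FALSE: the first keeper value in the 3/1/2018 group is FALSE. The OP's addendum notes that the problem goes away once a column is created. The reason is that after the column is created, the length of the "n" column is not 1 but n(). Repeating n() to the group length has the same effect:

dd %>% mutate(moose = ifelse(rep(n(), n()) == 1, TRUE, keeper))
# A tibble: 6 x 3
# Groups:   Date [5]
#  Date     keeper moose
#  <chr>    <lgl>  <lgl>
#1 1/1/2018 TRUE   TRUE 
#2 2/1/2018 TRUE   TRUE 
#3 3/1/2018 FALSE  FALSE
#4 4/1/2018 FALSE  TRUE 
#5 3/1/2018 TRUE   TRUE 
#6 5/1/2018 TRUE   TRUE 

Also, since the length of n() is 1, a scalar if/else can be used in place of ifelse.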
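The recycling behaviour discussed in this answer can be reproduced in isolation. This is a minimal sketch, assuming a current dplyr installation. The data frame is rebuilt with tibble() and group_by() instead of the dput, and the names k, one_value, whole_vec and res_if are illustrative. The final call uses a scalar if/else, which is one way to act on the observation that n() has length 1. It is offered as an illustration, not as the answer's own code.

```r
library(dplyr)

# Rebuild the example data with the same values as the dput in the question.
dd <- tibble(
  Date   = c("1/1/2018", "2/1/2018", "3/1/2018", "4/1/2018", "3/1/2018", "5/1/2018"),
  keeper = c(TRUE, TRUE, FALSE, FALSE, TRUE, TRUE)
) %>%
  group_by(Date)

# Base R view of a single group: the 3/1/2018 group has keeper = c(FALSE, TRUE).
k <- c(FALSE, TRUE)

# The test has length 1, so ifelse() returns one value: the first element of `k`.
one_value <- ifelse(length(k) == 1, TRUE, k)

# if/else is not vectorized; it returns the whole chosen object.
whole_vec <- if (length(k) == 1) TRUE else k

# Inside a grouped mutate(), n() is a single number per group, so a scalar
# if/else yields either a length-1 TRUE or the full `keeper` vector.
res_if <- dd %>%
  mutate(moose = if (n() == 1) TRUE else keeper)

res_if$moose
```

The design difference is that ifelse shapes its result after the test, while if/else picks one branch and returns it unchanged, which suits a condition that is constant within each group.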