summarise() error: can't promote group 1 to numeric

Time: 2018-10-09 13:03:41

Tags: r dplyr

I have the following data:

df <- structure(list(qnrA1 = c("0", "0", "0", "0", "0", "0", "0", "0", 
"0", "0"), qnrB19 = c("0", "0", "0", "0", "0", "0", "0", "0", 
"0", "0"), qnrB6 = c("0", "0", "0", "0", "0", "0", "0", "0", 
"0", "0"), qnrB60 = c("0", "0", "0", "0", "0", "0", "0", "0", 
"0", "0"), qnrS1 = c("0", "0", "0", "0", "0", "0", "0", "0", 
"0", "0"), qnrS2 = c("0", "0", "0", "0", "0", "0", "0", "0", 
"0", "0"), qnrS4 = c("0", "0", "0", "0", "0", "0", "0", "0", 
"0", "0"), gyrA = c("S83L", "S83L", "S83L", "S83L", "S83L", "S83L, D87N", 
"S83L", "S83L", "S83A", "S83L, D87N"), gyrB = c("0", "0", "0", 
"0", "0", "0", "0", "0", "0", "0"), parC = c("0", "S80I", "0", 
"0", "0", "S80I", "0", "0", "0", "S58I"), parE = c("0", "0", 
"0", "0", "0", "0", "0", "D475E", "0", "0"), marR = c("1", "1", 
"0", "1", "1", "1", "0", "1", "1", "0"), CIP = c(0.25, 1, 0.5, 
0.25, 0.25, 8, 0.12, 0.25, 0.06, 16), NAL = c(128L, 256L, 256L, 
256L, 64L, 256L, 64L, 128L, 32L, 256L)), row.names = c(NA, -10L
), class = c("tbl_df", "tbl", "data.frame"))

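What I want to do is group by all columns except CIP and NAL, and then do the equivalent of count(), but with new columns holding the CIP / NAL values found in each group. If the minimum and maximum CIP values in a group are equal, I want only that value in the column. If they differ, I want both in the same cell, separated by "-". I tried the following: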

library(dplyr)
df %>%
  group_by_at(vars(-c(CIP, NAL))) %>%
  summarise(CIP = ifelse(min(as.numeric(CIP)) == max(as.numeric(CIP)),
                         median(as.numeric(CIP)),
                         paste(min(as.numeric(CIP)), max(as.numeric(CIP)), sep = "-")),
            n = n())

However, when I run it, I get the following error:

Error in summarise_impl(.data, dots) : 
  Column `CIP` can't promote group 1 to numeric

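The error seems to occur when min(as.numeric(CIP)) != max(as.numeric(CIP)) for a group, because the code appears to run fine when I replace the paste line in the ifelse call with "0". Any suggestions as to what this error means?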

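1 answer:

Answer 0 (score: 3)

I'm not certain about the error, but if you look at the ifelse statement, you get a numeric value when the condition is true and a character value when it is false. summarise() has to combine the per-group results into a single column, so every group needs to return the same type, and I think this mismatch is what causes the error. You could try the following: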

df %>%
  group_by_at(vars(-c(CIP, NAL))) %>%
  summarise(CIP = ifelse(min(as.numeric(CIP)) == max(as.numeric(CIP)),
                         as.character(median(as.numeric(CIP))),
                         paste(min(as.numeric(CIP)), max(as.numeric(CIP)), sep = "-")),
            n = n())
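
The type mismatch can also be seen outside summarise(). The following is a minimal sketch in base R, not part of the original answer; range_label is a hypothetical helper name introduced here for illustration, and the two small vectors are made-up inputs. It shows that ifelse() returns a different type depending on which branch is taken, and one way to always return a character value.

```r
# ifelse() returns whichever branch is selected, so the type of the
# result depends on the condition.
x_same <- c(0.25, 0.25)  # min == max
x_diff <- c(0.25, 8)     # min != max

class(ifelse(min(x_same) == max(x_same), median(x_same), "a-b"))  # "numeric"
class(ifelse(min(x_diff) == max(x_diff), median(x_diff), "a-b"))  # "character"

# Hypothetical helper (an assumption, not a dplyr function): always returns
# a single character value, so every group yields the same type.
range_label <- function(x) {
  x <- as.numeric(x)
  if (min(x) == max(x)) {
    as.character(min(x))
  } else {
    paste(min(x), max(x), sep = "-")
  }
}

range_label(x_same)  # "0.25"
range_label(x_diff)  # "0.25-8"
```

Under the same grouping as above, such a helper could presumably be used as summarise(CIP = range_label(CIP), NAL = range_label(NAL), n = n()). A plain if / else is used instead of ifelse() because the condition here is a single value per group.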