I have the following data:
structure(list(qnrA1 = c("0", "0", "0", "0", "0", "0", "0", "0",
"0", "0"), qnrB19 = c("0", "0", "0", "0", "0", "0", "0", "0",
"0", "0"), qnrB6 = c("0", "0", "0", "0", "0", "0", "0", "0",
"0", "0"), qnrB60 = c("0", "0", "0", "0", "0", "0", "0", "0",
"0", "0"), qnrS1 = c("0", "0", "0", "0", "0", "0", "0", "0",
"0", "0"), qnrS2 = c("0", "0", "0", "0", "0", "0", "0", "0",
"0", "0"), qnrS4 = c("0", "0", "0", "0", "0", "0", "0", "0",
"0", "0"), gyrA = c("S83L", "S83L", "S83L", "S83L", "S83L", "S83L, D87N",
"S83L", "S83L", "S83A", "S83L, D87N"), gyrB = c("0", "0", "0",
"0", "0", "0", "0", "0", "0", "0"), parC = c("0", "S80I", "0",
"0", "0", "S80I", "0", "0", "0", "S58I"), parE = c("0", "0",
"0", "0", "0", "0", "0", "D475E", "0", "0"), marR = c("1", "1",
"0", "1", "1", "1", "0", "1", "1", "0"), CIP = c(0.25, 1, 0.5,
0.25, 0.25, 8, 0.12, 0.25, 0.06, 16), NAL = c(128L, 256L, 256L,
256L, 64L, 256L, 64L, 128L, 32L, 256L)), row.names = c(NA, -10L
), class = c("tbl_df", "tbl", "data.frame"))
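I want to group by every column except CIP and NAL and then do the equivalent of count(), but I also want to create new columns holding the CIP / NAL values found in each group. If the minimum and maximum CIP values within a group are equal, I want only that single value in the column. If they differ, I want both in the same row and column, separated by "-". I tried the following: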
library(dplyr)
df %>%
group_by_at(vars(-c(CIP, NAL))) %>%
summarise(CIP = ifelse(min(as.numeric(CIP)) == max(as.numeric(CIP)),
median(as.numeric(CIP)),
paste(min(as.numeric(CIP)), max(as.numeric(CIP)), sep = "-")),
n = n())
However, when I run it, I get the following error:
Error in summarise_impl(.data, dots) :
Column `CIP` can't promote group 1 to numeric
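The error seems to occur for groups where min(as.numeric(CIP)) != max(as.numeric(CIP)), because everything works when I replace the paste line inside ifelse with "0". Any suggestions as to what this error means?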
Answer 0 (score: 3)
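I'm not certain about the error, but if you look at the ifelse statement, it returns a numeric value when the condition is true and a character value when it is false. Groups therefore produce results of different types, and I think that is what causes the error. You can try the following, which converts the numeric branch to character as well: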
df %>%
group_by_at(vars(-c(CIP, NAL))) %>%
summarise(CIP = ifelse(min(as.numeric(CIP)) == max(as.numeric(CIP)),
as.character(median(as.numeric(CIP))),
paste(min(as.numeric(CIP)), max(as.numeric(CIP)), sep = "-")),
n = n())
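The question also asks for the same treatment of NAL. The following is a minimal sketch of one way to do that, not necessarily what the original poster ended up using. It assumes dplyr 1.0.0 or later (for across()). The helper name range_label and the small toy data set are made up for illustration. The helper uses a plain if/else and always returns a character string, so every group yields the same column type.

```r
library(dplyr)

# Hypothetical helper (not from the original post): returns a single
# character string, either "value" or "min-max".
range_label <- function(x) {
  x <- as.numeric(x)
  lo <- min(x)
  hi <- max(x)
  if (lo == hi) as.character(lo) else paste(lo, hi, sep = "-")
}

# Small made-up data set standing in for the question's df.
toy <- tibble(
  gyrA = c("S83L", "S83L", "S83A"),
  CIP  = c(0.25, 1, 0.06),
  NAL  = c(128L, 128L, 32L)
)

result <- toy %>%
  group_by(across(-c(CIP, NAL))) %>%           # group by everything else
  summarise(across(c(CIP, NAL), range_label),  # label both columns
            n = n(),
            .groups = "drop")

print(result)
```

With the question's data, replacing toy with df should apply the same logic to all grouping columns. Because each group contributes one scalar, if/else is sufficient here and avoids the branch-dependent return type of ifelse.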