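I have several variables that need to be combined into a single variable based on a text string their names have in common. My sample data is: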
structure(list(And = c(10L, NA, 10L), and = c(20L, 10L, 10L),
andbc = c(1L, NA, NA), baNdc = c(4L, NA, 5L), ban = c(1L,
NA, 1L)), .Names = c("And", "and", "andbc", "baNdc", "ban"), class = "data.frame", row.names = c(NA, -3L))
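I want to create a new variable x whose value is the row sum of the values of the other variables whose names share the common text string "and", ignoring the case of any letter in that string.
I tried creating the variable by spelling out the case permutations, which is what I would like to avoid: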
names1[, 1:5][is.na(names1[, 1:5])] <- 0
names1$x <- sum(names1[which(grepl("And|and|aNd", names(names1)))])
The x value I get is the grand total of all values in the variables that match the text string criterion, repeated on every row:
structure(list(And = c(10, 0, 10), and = c(20L, 10L, 10L), andbc = c(1, 0, 0), baNdc = c(4, 0, 5), ban = c(1, 0, 1), x = c(70, 70, 70)), .Names = c("And", "and", "andbc", "baNdc", "ban", "x"), row.names = c(NA, -3L), class = "data.frame")
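How can I get row sums based on a text string condition, without having to specify the uppercase and lowercase permutations?
Answer 0 (score: 2)
The following does the trick: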
df <- structure(list(And = c(10L, NA, 10L), and = c(20L, 10L, 10L),
andbc = c(1L, NA, NA), baNdc = c(4L, NA, 5L), ban = c(1L,
NA, 1L)), .Names = c("And", "and", "andbc", "baNdc", "ban"), class = "data.frame", row.names = c(NA, -3L))
x <- rowSums(df[, grep("and", tolower(colnames(df)))], na.rm = TRUE)
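A variant of the same idea, offered as a sketch rather than a definitive solution: `grepl()` accepts `ignore.case = TRUE`, so the column names do not have to be lowercased first. The helper name `hits` is introduced here only for illustration; `drop = FALSE` is added on the assumption that a pattern might match a single column, in which case `rowSums()` would otherwise receive a plain vector and fail.

```r
# Sample data from the question
df <- data.frame(
  And   = c(10L, NA, 10L),
  and   = c(20L, 10L, 10L),
  andbc = c(1L, NA, NA),
  baNdc = c(4L, NA, 5L),
  ban   = c(1L, NA, 1L)
)

# Case-insensitive match on the column names ("hits" is a hypothetical name);
# "ban" does not contain "and", so it is excluded
hits <- grepl("and", names(df), ignore.case = TRUE)

# drop = FALSE keeps a data frame even if only one column matches
df$x <- rowSums(df[, hits, drop = FALSE], na.rm = TRUE)

# Row totals are 35, 10 and 25
df$x
```

This stores the result in the data frame as the question asks and leaves the original column names untouched.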
Answer 1 (score: 1)
colnames(names1) <- tolower(colnames(names1))
removes the need to list the permutations. Then:
names1$x <- rowSums(names1[which(grepl('and', colnames(names1)))], na.rm = TRUE)
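The reason the attempt in the question returned 70 on every row can be illustrated with a short comparison. This is a sketch that assumes the data with NA already replaced by 0, as in the question; `hits`, `total` and `per_row` are names made up for the example. `sum()` collapses all matching columns into one number, which R then recycles down the column, whereas `rowSums()` returns one value per row. Note also that lowercasing the column names in place leaves two columns both called `and`, which the last line checks for.

```r
# Data from the question after NA has been replaced by 0
names1 <- data.frame(
  And   = c(10, 0, 10),
  and   = c(20, 10, 10),
  andbc = c(1, 0, 0),
  baNdc = c(4, 0, 5),
  ban   = c(1, 0, 1)
)

# Match against lowercased copies of the names without renaming the columns
hits <- grepl("and", tolower(names(names1)))

# sum() adds every cell of the matching columns into a single number
total <- sum(names1[hits])

# rowSums() adds across the matching columns separately for each row
per_row <- rowSums(names1[hits])

# TRUE here means lowercasing the names in place would create duplicates
has_duplicates <- anyDuplicated(tolower(names(names1))) > 0
```

The three per-row values add up to the single `total`, which is why the original code filled x with the same number everywhere.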