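This seems like a simple problem, but it has been bugging me for hours and my sleuthing has not turned up an answer.
I have a large data frame whose elements are character strings such as "ABCD".
If the 1st and 3rd characters of an element do not match, I want to replace that element with NA.
For example, the elements "DAD", "MOM", "BABOON" and "SISTER" would stay unchanged (because their first and third characters match), but "CAT", "STEP" and "JULIAN" would be set to NA. The length of each element varies, but it is always the first and third characters I am interested in.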
Of everything I have tried, the attempt shown after the sample data feels like the closest I have come. The data:
> dput(d)
structure(list(v1 = structure(c(6L, 2L, 1L, 3L, 4L, 5L), .Label = c("BABOON",
"BOB", "BOO", "CAR", "CAT", "JULIAN"), class = "factor"), v2 = structure(c(4L,
1L, 3L, 6L, 5L, 2L), .Label = c("GREEN", "GROW", "LINDA", "MOM",
"SKY", "TOP"), class = "factor"), v3 = structure(c(3L, 1L, 5L,
4L, 2L, 6L), .Label = c("DAD", "GAG", "LOGAN", "LOOK", "SISTER",
"STAR"), class = "factor")), .Names = c("v1", "v2", "v3"), class = "data.frame", row.names = c(NA,
-6L))
And the attempt, which is meant to produce d_with_NAs:
d_with_NAs=d[apply(d,1,function(y) if(substring(d[y],1,1) != substring(d[y],3,3)){y=NA}),]
Answer 0 (score: 1)
Try this (edited):
x <- c("DAD", "MOM", "BABOON", "SISTER", "CAT", "STEP", "JULIAN")
ind <- substr(x, 1, 1) != substr(x, 3, 3)
x[ind] <- NA
x
#[1] "DAD" "MOM" "BABOON" "SISTER" NA NA NA
On a data.frame, even more concisely and without type conversion:
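# dat is the data frame to clean (d in the question)
as.data.frame(apply(dat, 2, FUN = function(x) {
  tmp <- rep(NA, length(x))                  # start with all NA
  ind <- substr(x, 1, 1) == substr(x, 3, 3)  # TRUE where 1st and 3rd characters match
  tmp[ind] <- x[ind]                         # keep only the matching elements
  tmp
}))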
# v1 v2 v3
#1 <NA> MOM <NA>
#2 BOB <NA> DAD
#3 BABOON <NA> SISTER
#4 <NA> <NA> <NA>
#5 <NA> <NA> GAG
#6 <NA> <NA> <NA>
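As a variant on the idea above (a sketch, not the code from this answer), the same test can be run column by column with `lapply`, which never passes through a matrix and leaves the result as a data.frame with character columns. The helper name `na_unless_match` is made up for illustration, and the data is the question's `d` rewritten as plain strings so the block runs on its own.

```r
# The question's data as character columns (assumption: strings instead of factors)
d <- data.frame(
  v1 = c("JULIAN", "BOB", "BABOON", "BOO", "CAR", "CAT"),
  v2 = c("MOM", "GREEN", "LINDA", "TOP", "SKY", "GROW"),
  v3 = c("LOGAN", "DAD", "SISTER", "LOOK", "GAG", "STAR"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: NA out every element whose 1st and 3rd characters differ
na_unless_match <- function(x) {
  x <- as.character(x)  # also copes with factor columns
  x[substr(x, 1, 1) != substr(x, 3, 3)] <- NA
  x
}

d_with_NAs <- d
d_with_NAs[] <- lapply(d, na_unless_match)  # [] keeps the data.frame structure
d_with_NAs
```

Assigning into `d_with_NAs[]` rather than `d_with_NAs` is what keeps the column names and the data.frame class intact.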
Answer 1 (score: 1)
Just apply stas g's solution to the rows or columns of the data.frame:
x <- c("DAD", "MOM", "BABOON", "SISTER", "CAT", "STEP", "JULIAN")
y <- c("BOB", "TITLES", "CACAO", "PREGNANT", "FLIP", "TRINIAN", "COILSPRING")
df <- data.frame(x = x, y = y)
newdf <- apply(df, 2, function(x) {
  # this bit is exactly what stas g suggested
  ind <- substr(x, 1, 1) != substr(x, 3, 3)
  x[ind] <- NA
  return(x)
})
# note: apply() returns a character matrix here, not a data.frame
newdf
Answer 2 (score: 1)
If you are not wedded to data.frame objects, you can do this with matrix objects and substr.
# df here is the question's data frame (d in the dput output)
mat <- as.matrix(df)
idx <- which(substr(mat, 1, 1) != substr(mat, 3, 3))
mat[idx] <- NA
mat
v1 v2 v3
[1,] NA "MOM" NA
[2,] "BOB" NA "DAD"
[3,] "BABOON" NA "SISTER"
[4,] NA NA NA
[5,] NA NA "GAG"
[6,] NA NA NA
If you like, you can convert it back to a data.frame.
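Converting back might look like the sketch below. It rebuilds the question's data as plain character columns so the block is self-contained (an assumption; the original `d` holds factors), and `res` is just an illustrative name.

```r
# The question's data as character columns (assumption: strings instead of factors)
d <- data.frame(
  v1 = c("JULIAN", "BOB", "BABOON", "BOO", "CAR", "CAT"),
  v2 = c("MOM", "GREEN", "LINDA", "TOP", "SKY", "GROW"),
  v3 = c("LOGAN", "DAD", "SISTER", "LOOK", "GAG", "STAR"),
  stringsAsFactors = FALSE
)

mat <- as.matrix(d)                                   # character matrix
idx <- which(substr(mat, 1, 1) != substr(mat, 3, 3))  # linear indices of mismatches
mat[idx] <- NA

# Back to a data.frame; stringsAsFactors = FALSE keeps the columns as character
res <- as.data.frame(mat, stringsAsFactors = FALSE)
str(res)
```

Passing `stringsAsFactors = FALSE` explicitly matters on R versions before 4.0, where the default would turn the columns back into factors.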