I have some data:
df <- data.frame(v1 = c('word',NA,'word','word',NA,'word','word',NA,'word','word'),
v1_open = c('word',NA,'word','word',NA,'word','word',NA,'word','word'),
v2 = c('word','word',NA,'word','word',NA,'word','word',NA,'word'),
v2_open = c('word','word',NA,'word','word',NA,'word','word',NA,'word'))
I am using apply to recode every observation: NA becomes 0 and any other value becomes 1.
df <- t(apply(df,1,function(x){
ifelse(is.na(x) ,0,1)
}))
This returns:
v1 v1_open v2 v2_open
[1,] 1 1 1 1
[2,] 0 0 1 1
[3,] 1 1 0 0
[4,] 1 1 1 1
[5,] 0 0 1 1
[6,] 1 1 0 0
[7,] 1 1 1 1
[8,] 0 0 1 1
[9,] 1 1 0 0
[10,] 1 1 1 1
I would like to modify the apply call so that it skips the columns whose names contain the text '_open', giving:
v1 v1_open v2 v2_open
[1,] 1 word 1 word
[2,] 0 NA 1 word
[3,] 1 word 0 NA
[4,] 1 word 1 word
[5,] 0 NA 1 word
[6,] 1 word 0 NA
[7,] 1 word 1 word
[8,] 0 NA 1 word
[9,] 1 word 0 NA
[10,] 1 word 1 word
How can I do this?
Answer 0 (score: 3)
You can do:
library(dplyr)
df %>%
mutate_at(vars(-contains("_open")),
~ +(!is.na(.)))
Output:
v1 v1_open v2 v2_open
1 1 word 1 word
2 0 <NA> 1 word
3 1 word 0 <NA>
4 1 word 1 word
5 0 <NA> 1 word
6 1 word 0 <NA>
7 1 word 1 word
8 0 <NA> 1 word
9 1 word 0 <NA>
10 1 word 1 word
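In dplyr 1.0.0 and later, mutate_at is superseded by across. The following is a minimal sketch of the same idea with across, assuming a reasonably recent dplyr is installed; the shortened three-row df and the name res are only for illustration.

```r
library(dplyr)

# Shortened version of the data from the question (illustrative only)
df <- data.frame(
  v1      = c("word", NA, "word"),
  v1_open = c("word", NA, "word"),
  v2      = c("word", "word", NA),
  v2_open = c("word", "word", NA)
)

# Recode every column whose name does NOT contain "_open":
# !is.na(.x) is TRUE/FALSE, and the unary + turns that into 1/0
res <- df %>%
  mutate(across(!contains("_open"), ~ +(!is.na(.x))))

res
```

The '_open' columns are passed through untouched, so they keep their original values, including the NAs.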
Answer 1 (score: 1)
We can apply is.na directly to a subset of the data.frame's columns, without any loop, and then update those columns:
nm1 <- grep("_open", names(df), value = TRUE, invert = TRUE)
df[nm1] <- +(!is.na(df[nm1]))
df
# v1 v1_open v2 v2_open
#1 1 word 1 word
#2 0 <NA> 1 word
#3 1 word 0 <NA>
#4 1 word 1 word
#5 0 <NA> 1 word
#6 1 word 0 <NA>
#7 1 word 1 word
#8 0 <NA> 1 word
#9 1 word 0 <NA>
#10 1 word 1 word
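If you would rather stay close to the apply-style code from the question, the same column selection can be combined with lapply. This is only a sketch: it swaps apply for lapply so that the result stays a data.frame instead of being coerced to a matrix, and the shortened df is illustrative.

```r
# Shortened version of the data from the question (illustrative only)
df <- data.frame(
  v1      = c("word", NA, "word"),
  v1_open = c("word", NA, "word"),
  v2      = c("word", "word", NA),
  v2_open = c("word", "word", NA)
)

# Names of the columns that do NOT contain "_open"
nm1 <- grep("_open", names(df), value = TRUE, invert = TRUE)

# Recode only those columns, one column at a time:
# NA becomes 0, anything else becomes 1
df[nm1] <- lapply(df[nm1], function(x) as.integer(!is.na(x)))

df
```

Calling apply on the whole data.frame first converts it to a character matrix, which is why the original code cannot leave the '_open' columns alone; restricting the operation to df[nm1] avoids that.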
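Answer 2 (score: 0)
If your columns alternate between .* and .*_open, you can simply subset the columns with the logical vector c(TRUE, FALSE), which R recycles across all columns, i.e.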
df[c(TRUE, FALSE)] <- +(!is.na(df[c(TRUE, FALSE)]))
df
# v1 v1_open v2 v2_open
#1 1 word 1 word
#2 0 <NA> 1 word
#3 1 word 0 <NA>
#4 1 word 1 word
#5 0 <NA> 1 word
#6 1 word 0 <NA>
#7 1 word 1 word
#8 0 <NA> 1 word
#9 1 word 0 <NA>
#10 1 word 1 word
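Because the positional approach silently depends on the column order, it may be worth checking that assumption before overwriting anything. A minimal sketch, where the shortened df and the helper name idx are illustrative:

```r
# Shortened version of the data from the question (illustrative only)
df <- data.frame(
  v1      = c("word", NA, "word"),
  v1_open = c("word", NA, "word"),
  v2      = c("word", "word", NA),
  v2_open = c("word", "word", NA)
)

# Spell out the recycling explicitly: TRUE, FALSE, TRUE, FALSE, ...
idx <- rep_len(c(TRUE, FALSE), ncol(df))

# Stop early if the alternating pattern does not match the column names
stopifnot(identical(idx, !grepl("_open$", names(df))))

df[idx] <- +(!is.na(df[idx]))

df
```

If a column is ever added or reordered, the stopifnot call fails instead of recoding the wrong columns.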