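I have the following code. I want to find the cells that contain alphanumeric values, while ignoring cells that contain na or NA.
How should I modify the code? The required R command should return
TRUE, TRUE, FALSE, FALSE, TRUE, FALSE, FALSE for newcolumn.
I tried commands 3 and 4, but they failed :(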
> newcolumn=c(1,2,"na","NA","abc","","*")
> grepl("[[:alnum:]]", newcolumn)
[1] TRUE TRUE TRUE TRUE TRUE FALSE FALSE
> grepl("[[:alnum:]] | na", newcolumn)
[1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE
> grepl(c("[[:alnum:]]","na"), newcolumn)
[1] TRUE TRUE TRUE TRUE TRUE FALSE FALSE
Warning message:
In grepl(c("[[:alnum:]]", "na"), newcolumn) :
argument 'pattern' has length > 1 and only the first element will be used
> grepl("[[:alnum:]]" | "na" | "NA", newcolumn)
Error in "[[:alnum:]]" | "na" :
operations are possible only for numeric, logical or complex types
> str(newcolumn)
chr [1:7] "1" "2" "na" "NA" "abc" "" "*"
=========================== UPDATE1 ===============================
newcolumn2<-newcolumn[grepl("(?=(?i)na(N)?(*SKIP)(*F))|[[:alnum:]]|(?=(?i)nan(*SKIP)(*F))|(?=(?i)null(*SKIP)(*F))", newcolumn, perl=TRUE)]
I updated the code above because I want to identify na, nan, null and their variants, but the "null" part does not work. What should I change?
Answer 0 (score: 1)
Try:
grepl("(?=(?i)na(*SKIP)(*F))|[[:alnum:]]", newcolumn, perl=TRUE)
#[1] TRUE TRUE FALSE FALSE TRUE FALSE FALSE
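(?i) makes the match case-insensitive, so the pattern matches na, NA, nA, or Na. The (*SKIP)(*F) that follows makes that match fail and skips over the text it consumed. As a result, only the pattern to the right of the | symbol, i.e. [[:alnum:]], can produce a match.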
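The same logic can also be written without PCRE backtracking verbs, which may make the intent easier to follow. The following is a minimal sketch, not part of the original answer: it anchors the na test to the whole string (an assumption; the pattern above is not anchored), and the names has_alnum and is_na_token are hypothetical.

```r
# Sample data from the question; c() coerces every element to character.
newcolumn <- c(1, 2, "na", "NA", "abc", "", "*")

# TRUE where the string contains at least one alphanumeric character.
has_alnum <- grepl("[[:alnum:]]", newcolumn)

# TRUE where the whole string is "na" in any letter case (assumed rule).
is_na_token <- grepl("^na$", newcolumn, ignore.case = TRUE)

# Keep alphanumeric cells that are not an "na" token.
result <- has_alnum & !is_na_token
print(result)
# [1]  TRUE  TRUE FALSE FALSE  TRUE FALSE FALSE
```

Splitting the test into two logical vectors trades a single regex for two passes over the data, but each condition can be inspected on its own.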
To also exclude NaN and its case variants, make a trailing N optional:
newcolumn <- c(1,2,"na","NA","abc","","*", "NaN", "nan", "nAn")
grepl("(?i)na(N)?(*SKIP)(*F)|[[:alnum:]]", newcolumn, perl=TRUE)
# [1] TRUE TRUE FALSE FALSE TRUE FALSE FALSE FALSE FALSE FALSE
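As for the null case raised in the updated question: regex alternatives are tried left to right, so in the question's pattern [[:alnum:]] matches the n of null before the later null alternative is ever reached. One possible fix, sketched here as an assumption rather than as the answer's own code, is to place every token to exclude in a single group ahead of (*SKIP)(*F). The extra test values null, NULL and Null and the variable names pattern and keep are hypothetical.

```r
# Data from the answer, extended with hypothetical "null" variants.
newcolumn <- c(1, 2, "na", "NA", "abc", "", "*",
               "NaN", "nan", "nAn", "null", "NULL", "Null")

# All tokens to exclude come first: null, na, or nan in any letter case.
# (*SKIP)(*F) discards them; [[:alnum:]] is tried only on what remains.
pattern <- "(?i)(?:null|nan?)(*SKIP)(*F)|[[:alnum:]]"

keep <- grepl(pattern, newcolumn, perl = TRUE)

# Subset as in the updated question.
newcolumn2 <- newcolumn[keep]
print(newcolumn2)
```

Because the pattern is not anchored, a value such as nab still counts as alphanumeric: the leading na is skipped and the remaining b matches [[:alnum:]].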