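I want to check whether every element of a list matches the pattern I need; if not, the whole script should stop.
An example list looks like this: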
[1]
Archaea;Euryarchaeota;Methanobacteria;Methanobacteriales;Methanobacteriaceae;Methanobrevibacter;
[2]
Archaea;Euryarchaeota;Methanobacteria;Methanobacteriales;Methanobacteriaceae;Methanosphaera;
[3]
Archaea;Euryarchaeota;Methanobacteria;Methanobacteriales;Methanobacteriaceae;Methanosphaera;
[4]
Bacteria;Actinobacteria;Actinobacteria;Bifidobacteriales;Bifidobacteriaceae;Bifidobacterium;
[5]
Bacteria;Actinobacteria;Actinobacteria;Bifidobacteriales;Bifidobacteriaceae;Bifidobacterium;
[6]
Bacteria;Actinobacteria;Actinobacteria;Bifidobacteriales;Bifidobacteriaceae;Bifidobacterium;
[7]
Bacteria;Actinobacteria;Actinobacteria;Coriobacteriales;Coriobacteriaceae;Gordonibacter;
[8]
Bacteria;Actinobacteria;Coriobacteriia;Coriobacteriales;Coriobacteriaceae;;
[9]
Bacteria;Actinobacteria;Coriobacteriia;Coriobacteriales;Coriobacteriaceae;;
I want every entry to contain six semicolons. I tried pattern matching with grepl, but I had trouble getting the pattern right. This is what I tried:
if(!any(grepl(";{6}", taxonomy))) {
  # Throw an error message if the taxonomy is not in the right format
  stop("Wrong number of taxonomic classes\n
  Taxonomic levels have to be separated by semicolons (six in total).
  IMPORTANT: if taxonomic information at any level is missing, the semicolons are still needed:\n
  e.g.Bacteria;Bacteroidetes;Bacteroidia;Bacteroidales;Prevotellaceae;Prevotella;
  e.g.Bacteria;Bacteroidetes;Bacteroidia;Bacteroidales;Prevotellaceae;;")
} else {
But I always end up in the error branch.
Answer 0 (score: 2)
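count.fields, using the sep argument as the field separator, returns the number of fields in each line of the file or connection given as its first argument. An entry with six semicolons therefore has 7 fields. No packages are used.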
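f <- function(x) {
  # TRUE for each element of x that splits into exactly 7 fields (6 semicolons)
  ok <- count.fields(textConnection(x), sep = ";") == 7
  if (any(!ok)) stop("these row numbers do not have 7 fields: ", paste(which(!ok), collapse = ", "))
  # add whatever other code you need
}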
Testing it out:
# x has 2 components having 7 and 3 semicolon-separated fields respectively
x <- c("Archaea;Euryarchaeota;Methanobacteria;Methanobacteriales;Methanobacteriaceae;Methanobrevibacter;", ";;")
f(x)
## Error in f(x) : these row numbers do not have 7 fields: 2
See ?count.fields and ?textConnection.
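One caveat worth keeping in mind: by default count.fields treats quote characters and # specially and skips blank lines, so entries containing such characters may be miscounted. The following is only a sketch under that assumption; the helper name count_taxonomy_fields and the sample entries are made up for illustration.

```r
# Hypothetical helper (not part of the answer above): count the
# semicolon-separated fields of each element, with quote handling and
# comment handling switched off and blank entries kept.
count_taxonomy_fields <- function(x) {
  con <- textConnection(x)
  on.exit(close(con))  # release the connection when the function exits
  count.fields(con, sep = ";", quote = "", comment.char = "",
               blank.lines.skip = FALSE)
}

# Made-up entries: a plain one, one with an apostrophe, an empty one,
# and one containing '#'
x <- c("a;b;c;d;e;f;", "a;b'c;d;e;f;g;", "", "a;#b;c;d;e;f;")
counts <- count_taxonomy_fields(x)
which(counts != 7)  # positions of entries without exactly 7 fields
```

Keeping blank entries matters because skipped lines would shift the row numbers reported in the error message.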
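Answer 1 (score: 1)
;{6} matches six consecutive semicolons (";;;;;;") and nothing else. You want to check for something like
(?:[^;]*;){6}
which matches if (at least) 6 semicolons occur in the string.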
If you need to assert that every line you test has exactly 6 semicolons, you need to be more specific:
^(?:[^;]*;){6}[^;]*$
where ^ and $ are the start-of-string and end-of-string anchors, and [^;]* is a negated character class that matches any number of characters other than a semicolon.
R code
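To see how the three patterns differ, here is a small comparison on made-up strings. It is only a sketch: the sample strings are hypothetical, and perl = TRUE is an assumption added here so that the (?:...) group is handled by the PCRE engine.

```r
# Hypothetical test strings with 5, 6 and 7 semicolons, plus one with
# 6 semicolons where the last level is empty
samples <- c(five      = "a;b;c;d;e;",
             six       = "a;b;c;d;e;f;",
             seven     = "a;b;c;d;e;f;g;",
             six_empty = "a;b;c;d;e;;")

# Six semicolons in a row: none of the samples contains that
consecutive <- grepl(";{6}", samples)

# At least six semicolons anywhere in the string
at_least <- grepl("(?:[^;]*;){6}", samples, perl = TRUE)

# Exactly six semicolons in the whole string
exactly <- grepl("^(?:[^;]*;){6}[^;]*$", samples, perl = TRUE)

data.frame(consecutive, at_least, exactly, row.names = names(samples))
```

Only the anchored pattern rejects the string with seven semicolons while accepting both six-semicolon strings.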
> x<-c('Archaea;Euryarchaeota;Methanobacteria;Methanobacteriales;Methanobacteriaceae;Methanobrevibacter;',
'Archaea;Euryarchaeota;Methanobacteria;Methanobacteriales;Methanobacteriaceae;Methanosphaera;',
'Archaea;Euryarchaeota;Methanobacteria;Methanobacteriales;Methanobacteriaceae;Methanosphaera;',
'Bacteria;Actinobacteria;Actinobacteria;Bifidobacteriales;Bifidobacteriaceae;Bifidobacterium;',
'Bacteria;Actinobacteria;Actinobacteria;Bifidobacteriales;Bifidobacteriaceae;Bifidobacterium;',
'Bacteria;Actinobacteria;Actinobacteria;Bifidobacteriales;Bifidobacteriaceae;Bifidobacterium;',
'Bacteria;Actinobacteria;Actinobacteria;Coriobacteriales;Coriobacteriaceae;Gordonibacter;',
'Bacteria;Actinobacteria;Coriobacteriia;Coriobacteriales;Coriobacteriaceae;;',
'Bacteria;Actinobacteria;Coriobacteriia;Coriobacteriales;Coriobacteriaceae;;')
> grepl("^(?:[^;]*;){6}[^;]*$", x)
[1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
[9] TRUE
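Note that the check in the question uses !any(...), which only fires when no entry matches at all; to stop when any single entry is malformed, the result has to pass all(...). A minimal sketch of wiring the anchored pattern into the script, assuming a hypothetical function name check_taxonomy and perl = TRUE:

```r
# Hypothetical wrapper: stop unless every entry has exactly six semicolons
check_taxonomy <- function(taxonomy) {
  ok <- grepl("^(?:[^;]*;){6}[^;]*$", taxonomy, perl = TRUE)
  if (!all(ok)) {
    stop("Wrong number of taxonomic classes in entries: ",
         paste(which(!ok), collapse = ", "))
  }
  invisible(TRUE)
}

good <- c("Bacteria;Actinobacteria;Coriobacteriia;Coriobacteriales;Coriobacteriaceae;;")
bad  <- c(good, "Bacteria;Actinobacteria;")

check_taxonomy(good)  # passes silently

# Capture the error message instead of aborting, for demonstration only
msg <- tryCatch(check_taxonomy(bad), error = function(e) conditionMessage(e))
msg
```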
Answer 2 (score: 0)
Using library(stringr), you can do something like
which(str_count(taxonomy, ';') == 6)
or
grepl(6, str_count(taxonomy, ';'))
Note that grepl(6, ...) does a substring match on the count, so it also accepts counts such as 16 or 60; the == 6 comparison is the stricter check.
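If stringr is not available, the same count can be sketched in base R. The helper name count_semicolons and the sample vector below are made up for illustration.

```r
# Hypothetical base R helper: number of semicolons in each element
count_semicolons <- function(x) {
  lengths(regmatches(x, gregexpr(";", x, fixed = TRUE)))
}

# Made-up entries with 6, 2 and 0 semicolons
taxonomy <- c("a;b;c;d;e;f;", "a;b;", "no separators")
n <- count_semicolons(taxonomy)

n == 6          # exact comparison on the count
which(n != 6)   # positions of malformed entries

# grepl() matches the digit 6 anywhere in the count, so 16 and 60 pass too
loose <- grepl(6, c(6, 16, 60))
```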