Regex to match and replace p-value strings

Time: 2015-03-30 10:50:37

Tags: regex r

Hopefully this makes sense.

I have a vector of p-value strings of varying lengths (due to rounding). Very large and very small p-values are stored as "1." followed by zeros and "0." followed by zeros, respectively, where the number of zeros depends on the rounding precision.

I want to match two patterns:

First: "(1.)(string of zeros of any length)", which should become "> 0.(string of nines the same length as the string of zeros)"

Second: "(0.)(string of zeros of any length)", which should become "< 0.(string of zeros one shorter than the input string of zeros)1"

So, given the following input:

pvals<-c("1.000","1.00","0.00000","0.123","0.6","0.0")

I would like to get back:

> expectedOutput
[1] "> 0.999"   "> 0.99"    "< 0.00001" "0.123"     "0.6"       "< 0.1" 

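I have been trying to use gsub, but I know little about the more advanced uses of regular expressions. I don't understand how to match any number of a given character (0) and then replace it with the same number of a new character (in the 1.0 case), or with that number minus one (in the 0.0 case).

Any help is much appreciated! Thanks.

2 answers:

Answer 0: (score: 1)

You could do it like this: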

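> pvals <- c("1.000","1.00","0.00000","0.123","0.6","0.0")
> # Turn each trailing 0 after "1." into a 9 (\G continues from the previous match)
> x <- gsub("(?:^1\\.|\\G)\\K0(?=0*$)", "9", pvals, perl=TRUE)
> # Replace the leading "1." with "> 0."
> m <- gsub("^1\\.", "> 0.", x)
> # For all-zero values, swap the last 0 for a 1 and prepend "< "
> gsub("^(0\\.0*)0$", "< \\11", m)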
[1] "> 0.999"   "> 0.99"    "< 0.00001" "0.123"    
[5] "0.6"       "< 0.1"   
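
The same transformation can also be sketched in base R without \G and \K, by detecting the two cases with grepl and building the replacement strings with strrep. This is only an illustrative alternative: format_pvals is a hypothetical helper name, not something from the question or the answers, and strrep assumes R 3.3.0 or later.

```r
# Hypothetical helper: rewrites all-zero "1.xxx" and "0.xxx" strings,
# leaving every other p-value string unchanged.
format_pvals <- function(pvals) {
  out <- pvals
  # Strings such as "1.000": a 1, a dot, then only zeros.
  ones <- grepl("^1\\.0+$", pvals)
  # Strings such as "0.000": a 0, a dot, then only zeros.
  zeros <- grepl("^0\\.0+$", pvals)
  # Number of digits after the decimal point (total length minus "d.").
  n <- nchar(pvals) - 2
  # Same number of nines as there were zeros.
  out[ones] <- paste0("> 0.", strrep("9", n[ones]))
  # One zero fewer, followed by a 1.
  out[zeros] <- paste0("< 0.", strrep("0", n[zeros] - 1), "1")
  out
}

pvals <- c("1.000", "1.00", "0.00000", "0.123", "0.6", "0.0")
format_pvals(pvals)
# [1] "> 0.999"   "> 0.99"    "< 0.00001" "0.123"     "0.6"       "< 0.1"
```

Because both patterns are anchored with ^ and $, a value such as "0.05" is left untouched rather than being partially rewritten.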

Answer 1: (score: 1)

You can also do this with the gsubfn package.

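library(gsubfn)  # also attaches proto, which provides proto()

pvals <- c('1.000', '1.00', '0.00000', '0.123', '0.6', '0.0')

# x is the leading digit ("0" or "1"); y is the run of zeros after the dot
f <- proto(fun = function(this, x, y) 
   if (x == 1) paste0('> 0.', paste(rep(9, nchar(y)), collapse = '')) 
   else paste0('< 0.', paste(rep(0, nchar(y) - 1), collapse = ''), 1))

# Anchored with ^ and $ so that values such as "0.05" are not partially matched
gsubfn('^([01])\\.(0+)$', f, pvals)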
# [1] "> 0.999"   "> 0.99"    "< 0.00001" "0.123"     "0.6"       "< 0.1"