When using regex.find(input, pos), can I make Kotlin treat pos as the beginning of the line?
That is:
val s = "foo(2)"
/*let's say I already extracted "foo"
and now want to extract tokens '(', '2' and ')'
*/
val r1a = "\\(".toRegex()
val r1b = "\\)".toRegex()
println(r1a.find(s,3)?.let{"found '${it.value}'"} ?: "Nothing found")
println(r1b.find(s,3)?.let{"found '${it.value}'"} ?: "Nothing found")
println()
//this finds both
//but I only want to find '(' because it's at the beginning of the remaining string
val r2a = "^\\(".toRegex()
val r2b = "^\\)".toRegex()
println(r2a.find(s,3)?.let{"found '${it.value}'"} ?: "Nothing found")
println(r2b.find(s,3)?.let{"found '${it.value}'"} ?: "Nothing found")
println()
//this finds neither.
//I want the following behaviour:
val ss = s.substring(3)
println(r2a.find(ss,0)?.let{"found '${it.value}'"} ?: "Nothing found")
println(r2b.find(ss,0)?.let{"found '${it.value}'"} ?: "Nothing found")
println()
/*which finds '(' but not ')',
but without having to explicitly split the string
*/
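Is there a way to do this?
Edit
I don't want to match "foo(2)" as a whole.
I want to be able to feed this string to a list of matchers that will first match foo, then (, then 2, and then ).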
fun tokenizeLine(line:String){
var pos = 0
while(pos < line.length){
val result = nextToken(line,pos)
pos += result.consumed
result.token?.let { tokens.add(it) }
}
tokens.add(Token.EOL)
}
Each matcher returns one of these:
sealed class TokenizerResult(val consumed : Int, val token:Token?){
class Something(consumed:Int, token:Token):TokenizerResult(consumed,token)
class Skip(consumed:Int=0):TokenizerResult(consumed,null)
object Nothing:TokenizerResult(0,null)
}
and fun nextToken(input:String, pos:Int) : TokenizerResult walks the list of matchers until it either runs out of matchers to try or one of them returns something other than TokenizerResult.Nothing.
val matchers = listOf( skipWhitespace, number, parensOpen, parensClose, identifier, ... )
for(m in matchers){
result = m(input,pos)
if(result != TNothing) break
}
if(result == TNothing){
...
}
return result
Edit 2
A matcher typically looks like this:
class RawMatch(val regex:Regex) : Pattern{
override fun match(input: String, pos: Int, createToken: (value: String) -> Token): TokenizerResult {
return regex.find(input,pos)?.let { TSomething(it.value.length,createToken(it.value)) } ?: TNothing
}
}
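For reference, one way such a matcher could be anchored at pos is sketched below. It assumes Kotlin 1.7.20 or later, where Regex.matchAt is a stable part of the standard library; matchAt succeeds only if a match begins exactly at the given index, so the patterns are written without `^`. Token and AnchoredMatch are hypothetical stand-ins introduced only so the sketch compiles on its own, and TokenizerResult is trimmed to the two cases used here.

```kotlin
// Hypothetical stand-ins for the question's types, so the sketch is self-contained.
data class Token(val text: String)

sealed class TokenizerResult(val consumed: Int, val token: Token?) {
    class Something(consumed: Int, token: Token) : TokenizerResult(consumed, token)
    object Nothing : TokenizerResult(0, null)
}

// Anchored variant of RawMatch: Regex.matchAt returns a match only if it starts exactly at pos.
class AnchoredMatch(val regex: Regex) {
    fun match(input: String, pos: Int, createToken: (String) -> Token): TokenizerResult =
        regex.matchAt(input, pos)
            ?.let { TokenizerResult.Something(it.value.length, createToken(it.value)) }
            ?: TokenizerResult.Nothing
}

fun main() {
    val s = "foo(2)"
    val open = AnchoredMatch("\\(".toRegex())
    val close = AnchoredMatch("\\)".toRegex())
    println(open.match(s, 3, ::Token).token)   // Token(text=()
    println(close.match(s, 3, ::Token).token)  // null
}
```

With this variant, the ')' matcher reports nothing at position 3 instead of scanning ahead to position 5, and no substring has to be created.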
Answer 0 (score: 0)
If you want to find the value of whatever is inside the parentheses, you can use a different regex together with find and groupValues:
val str = "foo(2)"
val regex = "\\s*\\((\\d*)\\)".toRegex()
println(regex.find(str)?.groupValues?.last())
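The regex allows optional leading whitespace (\s*) and then looks for digits inside literal parentheses; find itself scans forward past the leading "foo". The digits are captured by a group (the unescaped pair of parentheses) and can be read from the groupValues property. Without string escaping, the regex looks like this: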
\s*\((\d*)\)
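To make the group indexing concrete, here is a small sketch of what find and groupValues return for this input. The helper name digitsInParens is hypothetical and exists only for illustration.

```kotlin
val parenRegex = "\\s*\\((\\d*)\\)".toRegex()

// Hypothetical helper: returns the digits captured by group 1, or null if there is no match.
fun digitsInParens(input: String): String? =
    parenRegex.find(input)?.groupValues?.get(1)

fun main() {
    val match = parenRegex.find("foo(2)")

    // groupValues[0] is the whole match, groupValues[1] is the first capturing group.
    println(match?.groupValues)        // [(2), 2]
    // find() scans forward, so the match begins at index 3 without a start index being given.
    println(match?.range)              // 3..5
    println(digitsInParens("foo(2)"))  // 2
}
```

Because find scans forward from its start index, this extracts the number but does not by itself pin the match to a particular position in the input.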