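I want to take a string as input and check whether it matches a regular expression; if it does not, I want to move on to another regular expression, and so on until I have run out of regular expressions. For example, suppose I have the following 3 regular expressions
Now suppose the input string is:
- val str_input="7569"
I want to check str_input against regex_1 first; if it does not match, try regex_2; and if that does not match either, finally try regex_3. The question is how to do this with SML/NJ. Thanks.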
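Answer 0 (score: 2)
You can do what you want with the regular expression library that ships with SML/NJ. Its documentation is available here: http://www.smlnj.org/doc/smlnj-lib/Manual/regexp-lib-part.html
As a small starter example, here is what you need to do. First, tell SML/NJ that you want to use the regexp library. You do that with a .cm file, here sources.cm (CM stands for Compilation Manager, which plays the role of a Makefile for SML/NJ):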
group is
  $/basis.cm       (* Load standard functions and modules. *)
  $/regexp-lib.cm  (* Load the regexp library. *)
  main.sml         (* Load our own source file. *)
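Now we can use the regexp library. Unfortunately it is not entirely straightforward, because it is built on functors and readers, but basically what you need is the RE.match function. It takes a list of pairs, where the first element of each pair is a regular expression and the second is a function to call when that regular expression matches. Given that list, RE.match reads the input string until it finds a match, at which point it calls the function paired with the regular expression that matched. The result of that function is the result of the whole RE.match call.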
structure Main =
struct
  (**
   * RE is a module created by calling the module-level function (functor)
   * RegExpFn (Fn comes from functor), with two module arguments.
   *
   * The first argument, called P, is the syntax used to write regular
   * expressions in. In this particular case, it's the Awk syntax, which
   * is the only syntax provided by SML/NJ right now.
   *
   * The second argument, called E, is the RegExp engine used behind the
   * scenes to compile and execute the syntax. In this particular case
   * I've opted for ThompsonEngine, which implements Ken Thompson's
   * matching algorithm. Other options are BackTrackEngine and DfaEngine.
   *)
  structure RE = RegExpFn(
    structure P = AwkSyntax
    structure E = ThompsonEngine
    (* structure E = BackTrackEngine *)
    (* structure E = DfaEngine *)
  )

  fun main () =
    let
      (**
       * A list of (regexp, match function) pairs. The function called by
       * RE.match is the one associated with the regexp that matched.
       *
       * The match parameter is described here:
       * http://www.smlnj.org/doc/smlnj-lib/Manual/match-tree.html
       *)
      val regexes = [
        ("[a-zA-Z]*", fn match => ("1st", match)),
        ("[0-9]*", fn match => ("2nd", match)),
        ("1tom|2jerry", fn match => ("3rd", match))
      ]

      val input = "7569"
    in
      (**
       * StringCvt.scanString turns the `input` string into a character
       * stream and applies the scanner `RE.match regexes` to that stream,
       * starting at the first character.
       *
       * It's sort of a streaming matching process. The end result, however,
       * depends on your implementation above, in the match functions.
       *)
      StringCvt.scanString (RE.match regexes) input
    end
end
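The "readers" mentioned earlier are the same convention the Basis library uses for its own scanners, so a smaller case may help show what StringCvt.scanString is doing in main. The sketch below is only an illustration: parseInt and parsed are names made up for it, and Int.scan stands in for RE.match regexes because it has the same overall shape (it takes a character reader and returns a reader of parsed values).

```sml
(* Int.scan StringCvt.DEC is a scanner: given a character reader, it
 * returns a reader that yields a decimal int. StringCvt.scanString
 * supplies the character reader for a plain string. *)
val parseInt : string -> int option =
  StringCvt.scanString (Int.scan StringCvt.DEC)

(* parseInt is an illustrative name, not part of any library. *)
val parsed = parseInt "7569"   (* SOME 7569 *)
```

In main, RE.match regexes plays the role that Int.scan StringCvt.DEC plays here, and the value inside the option is whatever the selected match function returns.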
You can now load the project and call Main.main from the command line:
$ sml sources.cm
Standard ML of New Jersey v110.79 [built: Sun Jan 3 23:12:46 2016]
[scanning sources.cm]
[library $/regexp-lib.cm is stable]
[parsing (sources.cm):main.sml]
[library $SMLNJ-BASIS/basis.cm is stable]
[library $SMLNJ-BASIS/(basis.cm):basis-common.cm is stable]
- Main.main ();
[autoloading]
[autoloading done]
val it = SOME ("2nd",Match ({len=4,pos=0},[]))
: (string * StringCvt.cs Main.RE.match) option
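If you want the strict "try regex_1, then regex_2, then regex_3" order from the question, one possible approach is to call RE.match once per regular expression and accept the first one whose reported match length covers the whole input. The following is a minimal sketch under some assumptions: firstFullMatch and the labels are hypothetical names, $/regexp-lib.cm is assumed to be loaded (for example through the sources.cm above), and "matches" is taken to mean that the len field of the match equals the length of the input.

```sml
(* Assumes $/regexp-lib.cm has been loaded, e.g. via sources.cm. *)
structure RE = RegExpFn(
  structure P = AwkSyntax
  structure E = ThompsonEngine
)

(* firstFullMatch is a hypothetical helper, not part of the library.
 * `rules` is a list of (label, regexp) pairs, tried in list order.
 * It returns the label of the first regexp whose match spans all of
 * `input`, or NONE if no regexp qualifies. *)
fun firstFullMatch (rules : (string * string) list) (input : string)
    : string option =
  let
    (* Length of the match that RE.match reports for one regexp. *)
    fun matchLen re =
      StringCvt.scanString
        (RE.match [(re, fn MatchTree.Match ({len, pos = _}, _) => len)])
        input

    fun try [] = NONE
      | try ((label, re) :: rest) =
          (case matchLen re of
             SOME n => if n = String.size input then SOME label else try rest
           | NONE => try rest)
  in
    try rules
  end

(* Example call with the three expressions from the answer. *)
val result =
  firstFullMatch
    [("regex_1", "[a-zA-Z]*"), ("regex_2", "[0-9]*"), ("regex_3", "1tom|2jerry")]
    "7569"
```

Comparing len with the input length is a design choice made for this sketch: patterns such as "[a-zA-Z]*" can match the empty string, so simply checking for SOME would not distinguish a real match from an empty one.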