How do I write a regular expression that removes the sentence after a certain number of tabs?
For example, this is the text in my RichTextBox:
a 00001740 0.125 0 able#1 (usually followed by `to') having the necessary means or skill or know-how or authority to do something; "able to swim"; "she was able to program her computer"; "we were at last able to buy a car"; "able to get a grant for the project"
a 00002098 0 0.75 unable#1 (usually followed by `to') not having the necessary means or skill or know-how; "unable to get to town without a car"; "unable to obtain funds"
a 00002312 0 0 dorsal#2 abaxial#1 facing away from the axis of an organ or organism; "the abaxial surface of a leaf is the underside or side facing away from the stem"
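This text comes from SentiWordNet. I want to remove the sentence that follows the fifth tab: for example, after the word able#1 the sentence (i.e. its gloss) should be dropped, and likewise after the next word, unable#1, its gloss should be dropped.
What regular expression would remove the glosses of the words in a SentiWordNet text file? Is there a way to do this, or could someone give me a small sample?
The output should look like this: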
a 00001740 0.125 0 able#1
a 00002098 0 0.75 unable#1
a 00002312 0 0 dorsal#2 abaxial#1
Answer 0 (score: 0)
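You could instead look for a # followed by digits, so the regex would be
(?<=#\d+)(?!\d)[^#]*$
[^#]*
matches zero or more characters other than #.
(?<=#\d+)
is a lookbehind: it requires that a # followed by one or more digits comes immediately before whatever [^#]* matches.
(?!\d)
stops the match from starting in the middle of a multi-digit sense number such as #12.
$
denotes the end of the string (or the end of each line when RegexOptions.Multiline is set).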
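Or, to remove the last tab-separated field:
\t[^\t]+$
You can then use the regex Replace function. If the input holds several lines, pass RegexOptions.Multiline so that $ matches at the end of every line rather than only at the end of the string:
input = Regex.Replace(input, regex, "", RegexOptions.Multiline);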
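As a rough sketch of how the tab-based pattern might be applied to multi-line text such as the contents of a RichTextBox: the class and method names (`GlossStripper`, `StripGlosses`) are hypothetical, and the code assumes the fields are separated by real tab characters and that the gloss itself contains no tabs. The pattern is a slight variant of the one above that excludes line breaks from the match so existing line endings are left alone.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GlossStripper
{
    // Matches a tab plus the last tab-separated field on a line.
    // The lookahead (?=\r?$) keeps any trailing \r out of the match,
    // so "\r\n" and "\n" line endings are both preserved.
    private static readonly Regex LastField =
        new Regex(@"\t[^\t\r\n]+(?=\r?$)", RegexOptions.Multiline);

    // Hypothetical helper: removes the final field (the gloss) from every line.
    // Assumption: every line has the gloss as its last tab-separated field.
    public static string StripGlosses(string input)
    {
        return LastField.Replace(input, "");
    }
}
```

It could be called as `richTextBox1.Text = GlossStripper.StripGlosses(richTextBox1.Text);`, where `richTextBox1` stands in for whatever the control is actually named.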
Answer 1 (score: 0)
This should do the job:
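using System.Text.RegularExpressions;
// The fields must be separated by real tab characters, written here as \t escapes.
string text = "a\t00001740\t0.125\t0\table#1\t(usually followed by `to') having the necessary means or skill or know-how or... ";
string res = Regex.Replace(text, @"((?:[^\t]+\t){5}).+$", "$1"); // keeps the first five fields, each with its trailing tab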
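For comparison, keeping the first five tab-separated fields can also be sketched without a regex by splitting each line. This swaps in `String.Split` for the pattern above and is only an illustration: `FieldTrimmer` and `KeepLeadingFields` are hypothetical names, and the code assumes real tab separators and normalizes line endings to `\n`.

```csharp
using System;
using System.Linq;

public static class FieldTrimmer
{
    // Hypothetical helper: keeps the first `count` tab-separated fields of each line
    // and drops everything after them (for SentiWordNet rows, the gloss).
    // Lines with fewer than `count` fields are returned unchanged.
    public static string KeepLeadingFields(string input, int count = 5)
    {
        var lines = input
            .Split('\n')
            .Select(line => string.Join("\t", line.TrimEnd('\r').Split('\t').Take(count)));

        // Note: "\r\n" line endings come back as "\n".
        return string.Join("\n", lines);
    }
}
```

Unlike the regex with `$1`, this variant leaves no trailing tab after the fifth field.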