How do I make a regex-like string function?

Time: 2014-11-22 00:31:23

Tags: c regex substring match

Question:

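Take a look at my "scan" function. This is where string matching is decided. (By the way, this is EEL code, a language very similar to C.) How would I start adding regex-like matching? I say "regex-like" because I don't want or need to fully replicate regex behavior. Specifically, I want to steal regex's \d feature, which is basically a character class meaning "match any character from '0' to '9'".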

Example: "abc \d\d\d" matches "abc 123".
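To pin down the intended semantics, here is a minimal C sketch (not EEL, and not the asker's code) of an anchored match in which the two-character sequence \d in the pattern accepts any one digit. The function name `match_at` is hypothetical, and the sketch assumes a NUL-terminated pattern and text.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: true if `text` begins with `pattern`, where the
   sequence \d in the pattern matches any single character '0'..'9'. */
static bool match_at(const char *pattern, const char *text)
{
    assert(pattern != NULL && text != NULL);
    while (*pattern != '\0') {
        if (*text == '\0')
            return false;                 /* text ended before the pattern */
        if (pattern[0] == '\\' && pattern[1] == 'd') {
            if (*text < '0' || *text > '9')
                return false;             /* \d needs a digit here */
            pattern += 2;                 /* consume both '\' and 'd' */
        } else {
            if (*pattern != *text)
                return false;             /* literal character mismatch */
            pattern += 1;
        }
        text += 1;
    }
    return true;                          /* whole pattern consumed */
}
```

With this sketch, `match_at("abc \\d\\d\\d", "abc 123")` is true, while the same pattern against "abc 12x" is false.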

Any explanations or useful tips for getting started on this kind of feature are welcome!

Details:

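Here is my "bi-directional string scanner" function, which outputs a substring. It is bi-directional because you must specify two matches (m1 and m2), two "how many times do you want to match before stopping?" counts (called ntimes1 and ntimes2), a starting position p1, and of course the string to scan (the haystack).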

It can go forward then forward 'ff', backward then backward 'bb', backward then forward 'bf', and forward then backward 'fb'.

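For example, say you want the number before the second occurrence of "abc":

"abc,1,def,4,ghi,5,abc,2,def,6,ghi,3"

Start at 0, go forward and match "abc" twice, then go backward and match "," twice: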

substring = ",5,abc" (currently the substring includes all of the matched strings)

Works great!
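The forward-then-backward walk described above can be sketched in C as follows. This is an illustration under assumptions, not a port of the EEL code below: `find_nth` and `extract_fb` are hypothetical names, matching is exact (no \d yet), and the backward pass starts at the position where the first match begins.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: starting at index `start`, step through `hay` in
   direction `dir` (+1 or -1) and return the start index of the n-th place
   where `needle` occurs, or -1 if there are fewer than n occurrences. */
static int find_nth(const char *hay, const char *needle,
                    int start, int dir, int n)
{
    int hlen = (int)strlen(hay);
    int nlen = (int)strlen(needle);
    assert((dir == 1 || dir == -1) && n > 0 && nlen > 0);

    for (int p = start; p >= 0 && p < hlen; p += dir) {
        /* strncmp stops at hay's terminating NUL, so this never reads past it */
        if (strncmp(hay + p, needle, (size_t)nlen) == 0 && --n == 0)
            return p;
    }
    return -1;
}

/* Forward to the n1-th `m1`, then backward from there to the n2-th `m2`.
   Copies the span from the start of that `m2` through the end of `m1`
   into `out`. Returns the span length, or -1 on failure. */
static int extract_fb(const char *hay, const char *m1, int n1,
                      const char *m2, int n2, char *out, size_t outsz)
{
    int p1 = find_nth(hay, m1, 0, +1, n1);        /* forward pass  */
    if (p1 < 0) return -1;
    int p2 = find_nth(hay, m2, p1, -1, n2);       /* backward pass */
    if (p2 < 0) return -1;
    size_t len = (size_t)(p1 - p2) + strlen(m1);  /* include both matches */
    if (len + 1 > outsz) return -1;               /* not enough room */
    memcpy(out, hay + p2, len);
    out[len] = '\0';
    return (int)len;
}
```

Calling `extract_fb("abc,1,def,4,ghi,5,abc,2,def,6,ghi,3", "abc", 2, ",", 2, buf, sizeof buf)` leaves ",5,abc" in `buf`: the second "abc" starts at index 18, and the second comma found walking backward from there is at index 15.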

CODE:

function scan(match,str,p,D,ntimes,mlen,hlen) local(lastp found break restart) (
  D == -1 ? (m=restart=mlen-1; end=-1) : (m=restart=0; end=mlen);
  found=break=0;
  while(p > -1 && p < hlen && !break) (
    //ShowConsoleMsg(sprintf(#, "%i:%i ->  ' %c ' = ' %c '\n", m,p,str_getchar(match, m),str_getchar(str, p)));
    str_getchar(match, m) == str_getchar(str, p) ? m+=D : m=restart;    
    m == end ? (found+=1; m=restart; lastp=p);
    found == ntimes ? break=1 : p+=D;
  );
  ntimes == -1 ? lastp : p;
);

function findpos(m1,m2,DIR,p1,ntimes1,ntimes2) local(adj start p1 p2 len1 len2 dir1 dir2 hlen) (
  hlen = strlen(this);
  len1 = strlen(m1);
  len2 = strlen(m2);
  ntimes1 < 1 ? ntimes1 = -1;
  ntimes2 < 1 ? ntimes2 = -1;

  DIR == 'ff' ? (dir1 =     dir2 = +1; adj=1):
  DIR == 'fb' ? (dir1 = +1; dir2 = -1; adj=1):
  DIR == 'bf' ? (dir1 = -1; dir2 = +1; adj=1):
                (dir1 =     dir2 = -1; adj=len1);

  p1 = scan(m1,this,p1,dir1,ntimes1,len1,hlen);

  DIR == 'ff' ? (p2=p1+len1+dir1; p1+=dir1-len1):
  DIR == 'bf' ? (p2=p1+len1):
                (p2=p1-len1);
  //ShowConsoleMsg(sprintf(#, "%i---------%i\n", p1,p2));
  p2 = scan(m2,this,p2,dir2,ntimes2,len2,hlen);

  dir2 == 1 ?    (p2+=len; this.pos1 = p1; this.pos2 = p2):
  DIR  == 'fb' ? (p2+=len; this.pos1 = p2; this.pos2 = p1):
                          (this.pos1 = p2; this.pos2 = p1);
  //ShowConsoleMsg(sprintf(#, "%i---------%i\n", p1,p2));
  this.substrlen = this.pos2-this.pos1+adj;  
  strcpy_substr(#, this, this.pos1, this.substrlen);
);

Usage example:

ShowConsoleMsg("");
string  = "C-2=0,C#-2=1,D-2=2,D#-2=3,";

ShowConsoleMsg( sprintf(#, "%s\n",string.findpos(",",",",'ff',8,1,1)) );
ShowConsoleMsg( sprintf(#, "%s\n",string.findpos(",",",",'fb',15,1,1)) );
ShowConsoleMsg( sprintf(#, "%s\n",string.findpos(",",",",'bf',15,1,1)) );
ShowConsoleMsg( sprintf(#, "%s\n",string.findpos(",",",",'bb',21,1,1)) );

ShowConsoleMsg( sprintf(#, "%s\n",string.findpos("1,",",D",'ff',8,1,1)) );
ShowConsoleMsg( sprintf(#, "%s\n",string.findpos(",D","1,",'fb',15,1,1)) );
ShowConsoleMsg( sprintf(#, "%s\n",string.findpos("1,",",D",'bf',15,1,1)) );
ShowConsoleMsg( sprintf(#, "%s\n",string.findpos(",D","1,",'bb',21,1,1)) );

ShowConsoleMsg( sprintf(#, "%s\n",string.findpos("1,",",",'ff',8,1,1)) );
ShowConsoleMsg( sprintf(#, "%s\n",string.findpos(",","1,",'fb',15,1,1)) );
ShowConsoleMsg( sprintf(#, "%s\n",string.findpos("1,",",",'bf',15,1,1)) );
ShowConsoleMsg( sprintf(#, "%s\n",string.findpos(",","1,",'bb',21,1,1)) );

ShowConsoleMsg( sprintf(#, "%s\n",string.findpos(",",",D",'ff',8,1,1)) );
ShowConsoleMsg( sprintf(#, "%s\n",string.findpos(",D",",",'bb',21,1,1)) );
ShowConsoleMsg( sprintf(#, "%s\n",string.findpos(",",",D",'bf',16,1,1)) );
ShowConsoleMsg( sprintf(#, "%s\n",string.findpos(",D",",",'fb',15,1,1)) );

Output:

,D-2=2,
,D-2=2,
,D-2=2,
,D-2=2,
1,D-2=2,D
1,D-2=2,D
1,D-2=2,D
1,D-2=2,D
1,D-2=2,
1,D-2=2,
1,D-2=2,
1,D-2=2,
,D-2=2,D
,D-2=2,D
,D-2=2,D
,D-2=2,D

1 answer:

Answer 0 (score: 1)

I haven't actually run your code, but from a quick look, I think you want to change this line:

str_getchar(match, m) == str_getchar(str, p) ? m+=D : m=restart; 

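Right now, it looks for an exact character match. Change it so that if the character from the search string is the magic character-class token, it checks a range instead of checking for the exact character.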

So something like

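char c = str_getchar(match, m);   // current pattern character, or the magic token
if (c == magic_thing) {
   // \d: accept any digit instead of one exact character
   char n = str_getchar(str, p);
   m = (n >= '0' && n <= '9') ? m + D : restart;
} else {
   // ordinary character: exact comparison, as before
   m = (c == str_getchar(str, p)) ? m + D : restart;
}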

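would go in there. To implement magic_thing, you could modify str_getchar (or write a wrapper for it) to check for a \. If one is there, read the next character too. If that character is 'd', return the magic token. Otherwise, return the character itself. You might do it differently, but generally I would want to put the \d recognition in the function that reads the string, rather than trying to make it part of the scan function itself.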
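One way to sketch that separation in C is to decode the pattern into tokens once, before scanning. This is a variation on the wrapper idea rather than the answer's exact suggestion: because each \d then occupies a single slot, a scan that steps through the pattern with m += D works the same way in either direction. `TOKEN_DIGIT`, `compile_pattern`, `token_matches`, and `find_tokens` are all hypothetical names, and an EEL version would need its own storage for the token list.

```c
#include <assert.h>
#include <stddef.h>

#define TOKEN_DIGIT 256   /* hypothetical magic value, outside the char range */

/* Decode `pattern` into `tokens` (at most `max` entries). The sequence \d
   becomes TOKEN_DIGIT; every other character becomes its own code.
   Returns the number of tokens written. */
static int compile_pattern(const char *pattern, int *tokens, int max)
{
    int n = 0;
    assert(pattern != NULL && tokens != NULL);
    for (size_t i = 0; pattern[i] != '\0' && n < max; i++) {
        if (pattern[i] == '\\' && pattern[i + 1] == 'd') {
            tokens[n++] = TOKEN_DIGIT;
            i++;                          /* skip the 'd' as well */
        } else {
            tokens[n++] = (unsigned char)pattern[i];
        }
    }
    return n;
}

/* One comparison step, as it would appear inside the scan loop. */
static int token_matches(int token, char ch)
{
    if (token == TOKEN_DIGIT)
        return ch >= '0' && ch <= '9';    /* range check for \d */
    return token == (unsigned char)ch;    /* exact check otherwise */
}

/* Return the index of the first place in `text` where all n tokens
   match consecutively, or -1 if there is no such place. */
static int find_tokens(const int *tokens, int n, const char *text)
{
    for (int p = 0; text[p] != '\0'; p++) {
        int m = 0;
        while (m < n && text[p + m] != '\0' &&
               token_matches(tokens[m], text[p + m]))
            m++;
        if (m == n)
            return p;
    }
    return -1;
}
```

For the pattern "abc \d\d\d" this yields seven tokens, with the last three equal to `TOKEN_DIGIT`, and `find_tokens` locates the match at index 3 in "xx abc 123". Note that `find_tokens` retries from the next text position after a partial match fails, which the original scan loop does not do.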