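How to test whether a string follows a given pattern in C?

Date: 2014-01-03 20:28:44

Tags: c regex

I want to check whether a string follows a certain pattern. I tried sscanf, but I did not get the result I wanted.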

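The pattern is simple. It consists of:

  • the string "while", followed by
  • one or more spaces, followed by
  • a string made of alphabetic characters or the underscore character, followed by
  • zero or more spaces, followed by
  • a colon (':'), followed by
  • the newline character ('\n')

Examples of the pattern: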

  • while condition_a:
  • while test_b :

I tried the following, but it does not check the colon:

sscanf(string, "while %[a-z,_]s %[:]c", test, column);

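Do you have any suggestions?

3 answers:

Answer 0: (score: 6)

This seems easy enough to implement. You don't need the unintuitive and quirky scanf(), nor non-portable (and, frankly, horrible) regular expressions: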

#include <ctype.h>
#include <string.h>

int isValid(const char *s)
{
    // the string "while" followed by
    // (strncmp stops at the terminator, so short inputs are safe)
    if (strncmp(s, "while", 5) != 0)
        return 0;

    s += 5;

    // one or more spaces, followed by
    // (isspace() also accepts tabs and newlines; the cast avoids
    // undefined behavior for negative char values)
    if (!isspace((unsigned char)*s))
        return 0;

    while (isspace((unsigned char)*++s))
        ;

    // a string made of alpha characters or the underscore character,
    // (I assumed zero or more)
    while (isalpha((unsigned char)*s) || *s == '_')
        s++;

    // followed by zero or more spaces
    while (isspace((unsigned char)*s))
        s++;

    // followed by a colon (':'),
    if (*s++ != ':')
        return 0;

    // followed by the newline character ('\n')
    if (*s++ != '\n')
        return 0;

    // here should be the end
    return !*s;
}
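
The code above treats any isspace() character as a space and lets the identifier be empty. If the pattern is read strictly, a variant could look like the sketch below. It assumes that "spaces" means only the ' ' character and that the identifier needs at least one character. The name is_while_line is made up for illustration.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical strict variant: only ' ' counts as a space and the
   identifier must contain at least one character (both are assumptions). */
int is_while_line(const char *s)
{
    size_t n;

    /* the string "while" */
    if (strncmp(s, "while", 5) != 0)
        return 0;
    s += 5;

    /* one or more ' ' characters */
    n = strspn(s, " ");
    if (n == 0)
        return 0;
    s += n;

    /* one or more alphabetic characters or underscores */
    n = 0;
    while (isalpha((unsigned char)s[n]) || s[n] == '_')
        n++;
    if (n == 0)
        return 0;
    s += n;

    /* zero or more ' ' characters */
    s += strspn(s, " ");

    /* a colon, a newline, then the end of the string */
    return strcmp(s, ":\n") == 0;
}
```

With this variant, "while :\n" and "while x\t:\n" are rejected, whereas isValid() above accepts both.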

Answer 1: (score: 1)

This test for the pattern seems to work:

   int n = 0;
   // every conversion is suppressed, so a full match returns 0 and %n sets n
   _Bool ok = sscanf(string, "while%*[ ]%*[A-Za-z_] :%*1[\n]%n", &n) == 0 &&
      n && !string[n];

This is nice and short, but has (at least) two flaws:

  • it is ugly
  • it allows arbitrary whitespace before the colon, not just spaces (e.g. tabs, newlines)
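
The second flaw is easy to reproduce. The sketch below wraps the check in a function (the name accepts_first_form is hypothetical) so that it can be called on a few inputs. The single space in the format string matches any run of whitespace, so a tab before the colon gets through.

```c
#include <stdio.h>

/* Hypothetical wrapper around the one-line sscanf check above. */
int accepts_first_form(const char *string)
{
    int n = 0;

    /* Every conversion is suppressed, so sscanf returns 0 on a full match;
       %n then stores the number of characters consumed. */
    return sscanf(string, "while%*[ ]%*[A-Za-z_] :%*1[\n]%n", &n) == 0 &&
           n && !string[n];
}
```

Here accepts_first_form("while test_b\t:\n") returns 1 even though the separator before the colon is a tab.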

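The only way to handle zero or more spaces with sscanf is to call it twice: once for the one-or-more case, and again for the zero case. For example, this code: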

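   char tail[4] = "";
   // caveat: the C standard describes %3c as matching exactly 3 characters,
   // so accepting the 2-character tail ":\n" may depend on the implementation
   _Bool ok = (sscanf(string, "while%*[ ]%*[A-Za-z_]%*[ ]%3c", tail) == 1 ||
               sscanf(string, "while%*[ ]%*[A-Za-z_]%3c",      tail) == 1) &&
               !strcmp(tail, ":\n");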
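
Another way to get zero or more literal spaces, sketched here as an alternative rather than as this answer's own method, is to let sscanf match only the prefix and record the offset with %n, then finish the check by hand with strspn and strcmp. The function name matches_while is hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: sscanf handles "while", the spaces and the
   identifier; the tail is checked manually. */
int matches_while(const char *string)
{
    int n = 0;

    /* All conversions are suppressed, so a successful prefix match
       returns 0 and %n stores the number of characters consumed.
       If the prefix does not match, n keeps its initial value of 0. */
    if (sscanf(string, "while%*[ ]%*[A-Za-z_]%n", &n) != 0 || n == 0)
        return 0;

    string += n;
    string += strspn(string, " ");      /* zero or more ' ' characters */
    return strcmp(string, ":\n") == 0;  /* colon, newline, end of string */
}
```

Because the tail is compared with strcmp, a tab or a second newline before the colon is rejected, and nothing relies on how %3c behaves at the end of the input.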

Answer 2: (score: 1)

Regular expressions seem like a reasonable tool:

#include <assert.h>
#include <regex.h>
#include <stdio.h>
#include <stdlib.h>

int main() {
    const char *expression = "^while +([a-zA-Z_]+) *:\n$";
    const char *input = NULL;
    regex_t regex;
    int rc;

    size_t nmatch = 2;
    regmatch_t pmatch[2];

    rc = regcomp(&regex, expression, REG_EXTENDED);
    assert(rc == 0);

    input = "while condition_a:\n";
    rc = regexec(&regex, input, nmatch, pmatch, 0);
    if(rc == 0) {
        printf("Match: %.*s\n", (int)(pmatch[1].rm_eo - pmatch[1].rm_so), input + pmatch[1].rm_so);
    } else if (rc == REG_NOMATCH) {
        printf("No match\n");
    } else {
        char msgbuf[64];
        regerror(rc, &regex, msgbuf, sizeof(msgbuf));
        printf("Regex match failed: %s\n", msgbuf);
    }

    input = "while test_b :\n";
    rc = regexec(&regex, input, nmatch, pmatch, 0);
    if(rc == 0) {
        printf("Match: %.*s\n", (int)(pmatch[1].rm_eo - pmatch[1].rm_so), input + pmatch[1].rm_so);
    } else if (rc == REG_NOMATCH) {
        printf("No match\n");
    } else {
        char msgbuf[64];
        regerror(rc, &regex, msgbuf, sizeof(msgbuf));
        printf("Regex match failed: %s\n", msgbuf);
    }

    regfree(&regex);
}

This outputs:

Match: condition_a
Match: test_b
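
When only a yes/no answer is needed, the same expression can be wrapped in a small predicate. This is a sketch under a few assumptions: the function name matches_while_regex is hypothetical, the capture group is dropped, and REG_NOSUB is passed because no match offsets are requested.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical predicate: returns 1 on a match, 0 on no match,
   and -1 if the expression fails to compile. */
int matches_while_regex(const char *input)
{
    regex_t regex;
    int rc;

    /* Without REG_NEWLINE, ^ and $ anchor only at the start and end of
       the whole string, and the newline is an ordinary character. */
    if (regcomp(&regex, "^while +[a-zA-Z_]+ *:\n$",
                REG_EXTENDED | REG_NOSUB) != 0)
        return -1;

    rc = regexec(&regex, input, 0, NULL, 0);
    regfree(&regex);
    return rc == 0;
}
```

Compiling the expression on every call keeps the sketch short. Code that checks many lines would more likely compile it once and reuse the regex_t.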