Possible memory error in my code, and possible solutions?

Time: 2014-04-02 07:24:20

Tags: c regex valgrind

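The source code given below is a simplified version of some more elaborate C code that parses input strings to see whether they match predefined patterns. The code tries to parse an input string (you can assume it is a valid NUL-terminated string). If the string contains a valid unsigned integer, the function returns 0; otherwise it returns the error -1. An unsigned integer is anything that matches the regular expression [0-9]+$.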

I tried running valgrind to find possible errors, and it shows the following output (which I cannot make sense of).

    ==15269==
    ==15269== Invalid read of size 1
    ==15269==    at 0x400770: parse_exact (assign2b.c:23)
    ==15269==    by 0x400957: xtz_parse_unsigned (assign2b.c:82)
    ==15269==    by 0x400A26: test_parse_unsigned (assign2b.c:102)
    ==15269==    by 0x400B06: main (assign2b.c:128)
    ==15269==  Address 0x51f2045 is 0 bytes after a block of size 5 alloc'd
    ==15269==    at 0x4C2B6CD: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
    ==15269==    by 0x4EBAD81: strdup (strdup.c:43)
    ==15269==    by 0x400AF1: main (assign2b.c:127)
    ==15269==
    ==15269== Invalid read of size 1
    ==15269==    at 0x400770: parse_exact (assign2b.c:23)
    ==15269==    by 0x400957: xtz_parse_unsigned (assign2b.c:82)
    ==15269==    by 0x400A26: test_parse_unsigned (assign2b.c:102)
    ==15269==    by 0x400B9B: main (assign2b.c:142)
    ==15269==  Address 0x51f2135 is 0 bytes after a block of size 5 alloc'd
    ==15269==    at 0x4C2B6CD: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
    ==15269==    by 0x400B72: main (assign2b.c:140)

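Here is the code. Please tell me what the error in the code is, what a possible solution would be, and how the same error can be deduced from the valgrind output.

    #include <stdio.h>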
    #include <ctype.h>
    #include <assert.h>
    #include <stdlib.h>
    #include <string.h>

    #define OK 9999
    #define EOS '\0'
    #define XT_SUCCESS 0
    #define XT_FAIL -1

    typedef int (*PARSE_FUNC)(const char *s, const char **endptr);

    static int parse_exact(const char *s, const char **endptr, PARSE_FUNC pfunc)
    {
        const char *cp = s;
        int c;
        int state = 1;
        while (state != XT_SUCCESS && state != XT_FAIL)
        {
            c = *cp++;  // nextchar
            switch(state)
            {
            case 1:
                state = pfunc(--cp, endptr);
                cp = *endptr;
                if (state == XT_SUCCESS) state = 2;
                else cp++;  // on FAIL jump ahead to get undone on exit
                break;
            case 2:
                if (EOS == c) state = OK;
                else state = XT_FAIL;
                break;
            case OK:
                state = XT_SUCCESS;
                break;
            default:
                /* LOGIC ERROR */
                assert(0==1);
                break;
            }
        }
        if (endptr) 
            *endptr = --cp;
        return state;
    }

    static int base_unsigned(const char *s, const char **endptr)
    {
        const char *cp = s;
        int c;
        int state = 1;
        while (state != XT_SUCCESS && state != XT_FAIL)
        {
            c = *cp++;  // getnextchar
            switch(state)
            {
            case 1:
                if (isdigit(c)) state = 2;
                else state = XT_FAIL;
                break;
            case 2:
                if (isdigit(c)) state = 2;
                else state = XT_SUCCESS;
                break;
            default:
                /* LOGIC ERROR */
                assert(0==1);
                break;
            }
        }
        if (endptr) 
            *endptr = --cp;
        return state;
    }

    int xtz_parse_unsigned(const char *s, const char **endptr)
    {
        PARSE_FUNC pfunc = base_unsigned;
        return parse_exact(s, endptr, pfunc);
    }

    void xt_pr_error(int status, const char *s, const char *endptr)
    {
        if (0 != status)
        {
            if (endptr[0])
                printf("ERROR: '%c' at position %d is not allowed", *endptr, (endptr - s)+1);
            else if ((endptr - s) > 0)
                printf("ERROR: cannot end with '%c'", endptr[-1]);
            else
                printf("ERROR: value is empty");
        }
    }

    void test_parse_unsigned(const char *s, int expected)
    {
        int status;
        const char *endptr; // Ptr to first invalid character
        status = xtz_parse_unsigned(s, &endptr);
        printf("Test input='%s' status=%d ", s, status);
        xt_pr_error(status, s, endptr);
        if (status != expected)
            printf(" NOT EXPECTED!\n");
        else
            printf(" (OK)\n");
    }


    int main(void)
    {
        char s1234[] = "1234";
        char s12a4[] = "12a4";
        char *ptr;

        // Tests with string literals
        test_parse_unsigned("1234", XT_SUCCESS);
        test_parse_unsigned("12a4", XT_FAIL);

        // Tests with static strings arrays
        test_parse_unsigned(s1234, XT_SUCCESS);
        test_parse_unsigned(s12a4, XT_FAIL);

        // Tests using strdup()
        ptr = strdup("1234");
        test_parse_unsigned(ptr, XT_SUCCESS);
        free(ptr);

        ptr = strdup("123a");
        test_parse_unsigned(ptr, XT_FAIL);
        free(ptr);

        ptr = strdup("1a34");
        test_parse_unsigned(ptr, XT_FAIL);
        free(ptr);

        // Test using malloc and strcpy()
        ptr = malloc(5);
        strcpy(ptr, "1234");
        test_parse_unsigned(ptr, XT_SUCCESS);
        free(ptr);

        return 0;
    }

2 answers:

Answer 0: (score: 1)

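It is difficult to tell the real error just from reading the code; you would have to step through it with a debugger. But from the look of the valgrind output, you are reading one byte too far, so either your string is not terminated properly or you do not handle the end-of-string condition well.

The function that valgrind points to is a bit hard to follow, because it has no clear condition for the end of the string, namely when c is '\0'.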

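In addition:

Things like *cp++ belong in a museum; do not use expressions for their side effects. Instead of the while loop, you could easily use a for loop with cp as the iteration variable:

    for (const char *cp = s;
         state != XT_SUCCESS && state != XT_FAIL;
         ++cp) {
        ...
    }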

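Your use of the state variable with a mixture of named constants and bare numbers is maddening for others to read, and it will be unreadable to you as well if you come back to it in a week.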
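As a rough sketch of the two suggestions above, the scanner could use an enum for its states and a for loop that tests the terminator explicitly. The names `scan_state` and `is_unsigned` are invented for this example, and it folds parse_exact and base_unsigned into a single function, which is a simplification the original design may not want.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical state names; the question uses 1, 2, OK, XT_SUCCESS, XT_FAIL. */
enum scan_state { ST_START, ST_DIGITS, ST_ACCEPT, ST_REJECT };

/* Returns 0 if s consists of one or more decimal digits, -1 otherwise.
   If endptr is non-NULL, *endptr is set to the terminating NUL on success
   or to the first offending character on failure. */
static int is_unsigned(const char *s, const char **endptr)
{
    enum scan_state state = ST_START;
    const char *cp;

    for (cp = s; state != ST_ACCEPT && state != ST_REJECT; ++cp) {
        /* isdigit() needs a value representable as unsigned char */
        unsigned char c = (unsigned char)*cp;

        switch (state) {
        case ST_START:
            state = isdigit(c) ? ST_DIGITS : ST_REJECT;
            break;
        case ST_DIGITS:
            if (c == '\0')
                state = ST_ACCEPT;      /* explicit end-of-string test */
            else if (!isdigit(c))
                state = ST_REJECT;
            break;
        default:
            assert(0 == 1);             /* logic error: unreachable */
            break;
        }
    }

    /* The loop advanced cp once past the character that decided the result. */
    if (endptr)
        *endptr = cp - 1;
    return state == ST_ACCEPT ? 0 : -1;
}
```

Because the loop stops in the same iteration that sees the NUL, the terminator is the last byte this version reads.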

Answer 1: (score: 0)

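In the function parse_exact, you are reading one position beyond the EOS. In detail, this is what happens when you reach the end of the input string:

    c = *cp++;  // nextchar

reads the NUL character (EOS).

    if (EOS == c) state = OK;

Because the state is now neither XT_SUCCESS nor XT_FAIL, the loop makes another pass.

    c = *cp++;  // nextchar

reads the character beyond the EOS. On some systems this is tolerated, but with strict bounds checking it is not. In your case, it is reported as an error.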

    case OK:
        state = XT_SUCCESS;

Eventually, state becomes XT_SUCCESS, which makes me wonder why you have the intermediate state OK at all.
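To make the walk-through above concrete, here is a small sketch that replays the state machine of parse_exact on offsets instead of pointers, so it can record which offsets the original loop reads without ever touching memory past the terminator. The function name `trace_max_read` is made up for this illustration, and the digit scan in state 1 stands in for the call to base_unsigned.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define OK 9999
#define XT_SUCCESS 0
#define XT_FAIL -1

/* Replays parse_exact() from the question using offsets.
   Returns the highest offset the original loop would read.
   The read made in state OK is only recorded, never performed. */
static size_t trace_max_read(const char *s)
{
    size_t len = strlen(s);
    size_t pos = 0;                 /* plays the role of cp - s */
    size_t max_read = 0;
    int state = 1;

    while (state != XT_SUCCESS && state != XT_FAIL) {
        size_t at = pos++;          /* c = *cp++ reads offset 'at' */
        int c = (at <= len) ? s[at] : 0;   /* stay inside the string here */
        if (at > max_read)
            max_read = at;

        switch (state) {
        case 1: {
            /* base_unsigned() scans up to the first non-digit */
            size_t end = 0;
            while (isdigit((unsigned char)s[end]))
                end++;
            if (end > max_read)
                max_read = end;
            pos = end;
            if (end > 0) {
                state = 2;
            } else {
                state = XT_FAIL;
                pos++;
            }
            break;
        }
        case 2:
            state = (c == '\0') ? OK : XT_FAIL;
            break;
        case OK:
            state = XT_SUCCESS;     /* reached only after one extra read */
            break;
        default:
            assert(0 == 1);
            break;
        }
    }
    return max_read;
}
```

For "1234" this returns 5. A 5-byte block holds offsets 0 to 4, so offset 5 is exactly what valgrind describes as "0 bytes after a block of size 5". For "12a4" it returns 2, which is why only the successful heap-allocated inputs show up in the report.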

I suggest you drop OK and replace it with XT_SUCCESS in this line of code:

    if (EOS == c) state = OK;
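Applied to the code from the question, the change might look like the sketch below. It keeps the overall structure of parse_exact and base_unsigned and removes the OK state. Two further changes are my own assumptions rather than part of the suggestion: the `unsigned char` cast before isdigit, and the local variable `next`, which avoids dereferencing endptr before it has been checked for NULL.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

#define EOS '\0'
#define XT_SUCCESS 0
#define XT_FAIL -1

typedef int (*PARSE_FUNC)(const char *s, const char **endptr);

/* Same state machine as in the question, written more compactly. */
static int base_unsigned(const char *s, const char **endptr)
{
    const char *cp = s;
    int state = 1;

    while (state != XT_SUCCESS && state != XT_FAIL) {
        int c = (unsigned char)*cp++;
        switch (state) {
        case 1:  state = isdigit(c) ? 2 : XT_FAIL;    break;
        case 2:  state = isdigit(c) ? 2 : XT_SUCCESS; break;
        default: assert(0 == 1);                      break;
        }
    }
    if (endptr)
        *endptr = --cp;
    return state;
}

static int parse_exact(const char *s, const char **endptr, PARSE_FUNC pfunc)
{
    const char *cp = s;
    int state = 1;

    while (state != XT_SUCCESS && state != XT_FAIL) {
        int c = *cp++;
        switch (state) {
        case 1: {
            const char *next = s;   /* where the sub-parser stopped */
            state = pfunc(--cp, &next);
            cp = next;
            if (state == XT_SUCCESS) state = 2;
            else cp++;              /* undone by --cp on exit */
            break;
        }
        case 2:
            /* Finish as soon as the terminator is seen: no extra pass. */
            state = (EOS == c) ? XT_SUCCESS : XT_FAIL;
            break;
        default:
            assert(0 == 1);
            break;
        }
    }
    if (endptr)
        *endptr = --cp;
    return state;
}

int xtz_parse_unsigned(const char *s, const char **endptr)
{
    return parse_exact(s, endptr, base_unsigned);
}
```

With this version the last byte read for "1234" is the terminator at offset 4, so the strdup() and malloc(5) tests should no longer trigger the invalid-read report, and *endptr is left pointing at the NUL rather than one past it.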