Boyer-Moore algorithm

Time: 2014-05-07 16:06:42

Tags: c output string-search intrusion-detection boyer-moore

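I am trying to implement the Boyer-Moore algorithm in C to search for a specific word in a .pcap file. I based my implementation on the code at http://ideone.com/FhJok5, which is the code I am using.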

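I pass each packet as a string, together with the keyword I am searching for, to the function search(). When I run my code, it gives a different value on every run. Sometimes it gives the correct value, but most of the time it fails to find some of the matches.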
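
One possible explanation, offered as an assumption because the calling code is not shown: raw packet bytes are not a C string. They can contain zero bytes in the middle, and they are not guaranteed to end with one, so strlen() may stop early or read past the end of the buffer. The sketch below uses a hypothetical helper, c_string_prefix_len(), to show how many bytes a string function would actually see in a buffer of known length.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of the original code): returns how many
   bytes of buf precede the first zero byte, looking at no more than
   cap_len bytes. Unlike strlen(), it never reads past the buffer. */
size_t c_string_prefix_len(const unsigned char *buf, size_t cap_len)
{
    const unsigned char *nul = memchr(buf, 0, cap_len);
    return nul ? (size_t)(nul - buf) : cap_len;
}
```

If this returns less than the captured length for a packet, a strlen()-based search only covers the first part of that packet. If it returns the full length, the buffer has no terminator and strlen() would run into whatever memory follows it, which could differ from run to run.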

I already have results from a naive algorithm implementation, and those results are always correct.
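
The naive implementation itself is not shown in the question, so the following is only a sketch of what such a baseline might look like when it takes explicit lengths instead of relying on strlen(). The name naive_count() and its signature are assumptions made for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical baseline: counts every (possibly overlapping) occurrence
   of pat[0..m) in txt[0..n) by comparing at each position. */
size_t naive_count(const unsigned char *txt, size_t n,
                   const unsigned char *pat, size_t m)
{
    size_t count = 0, s;

    if (m == 0 || m > n)
        return 0;

    for (s = 0; s + m <= n; s++)
        if (memcmp(txt + s, pat, m) == 0)
            count++;

    return count;
}
```

A baseline of this shape gives a per-packet match count that a faster search can be compared against.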

I am using Ubuntu 12.04 on VMware 10.0.1. Language: C

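My question: shouldn't the program give the same result every time, whether right or wrong? The output keeps changing each time I run it on the same input file; on a few runs it even gives the correct answer. Most of the time the value varies among 3 or 4 different values.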

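For debugging, this is what I have done so far:

  1. Passed a plain string instead of a packet: the value is the same and correct on every run.
  2. Checked the pcap part: I can see that all packets are passed to the function (I verified this by printing the packet frame numbers).
  3. Sent the same packets to the naive algorithm code: it gives correct results.

Please give me some ideas about what the problem could be. I suspect a memory-management issue, but how do I find out which one?

Thanks in advance.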
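
One way to test the memory suspicion is to hand search() a copy of the packet that is guaranteed to be terminated. This is a minimal sketch, assuming the packet is available as a pointer plus a captured length (libpcap, for example, reports it in the caplen field of struct pcap_pkthdr). The helper name packet_to_cstring() is made up for this example.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical debugging aid: copies len packet bytes into a fresh buffer
   and appends a terminating zero byte, so string functions cannot read
   past the end. Returns NULL if allocation fails; the caller frees it. */
char *packet_to_cstring(const unsigned char *data, size_t len)
{
    char *copy = malloc(len + 1);

    if (copy == NULL)
        return NULL;

    memcpy(copy, data, len);
    copy[len] = '\0';
    return copy;
}
```

If the results become stable with the copy, the run-to-run variation most likely came from reading beyond the packet buffer. Zero bytes inside the packet would still cut the search short, so this is a diagnostic step rather than a fix. Running the program under a memory checker such as Valgrind may also report the out-of-bounds reads directly.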

    #include <limits.h>
    #include <string.h>
    #include <stdio.h>
    
    #define NO_OF_CHARS 256
    
    // A utility function to get maximum of two integers
    int max (int a, int b) { return (a > b)? a: b; }
    
    // The preprocessing function for Boyer Moore's bad character heuristic
    void badCharHeuristic( char *str, int size, int badchar[NO_OF_CHARS])
    {
        int i;
    
        // Initialize all occurrences as -1
        for (i = 0; i < NO_OF_CHARS; i++)
             badchar[i] = -1;
    
        // Fill the actual value of last occurrence of a character
        for (i = 0; i < size; i++)
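             // Index through unsigned char: plain char may be signed, so a
             // byte >= 0x80 would otherwise give a negative array index
             badchar[(unsigned char) str[i]] = i;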
    }
    
    /* A pattern searching function that uses Bad Character Heuristic of
       Boyer Moore Algorithm */
    void search( char *txt,  char *pat)
    {
        int m = strlen(pat);
        int n = strlen(txt);
    
        int badchar[NO_OF_CHARS];
    
        /* Fill the bad character array by calling the preprocessing
           function badCharHeuristic() for given pattern */
        badCharHeuristic(pat, m, badchar);
    
        int s = 0;  // s is shift of the pattern with respect to text
        while(s <= (n - m))
        {
            int j = m-1;
    
            /* Keep reducing index j of pattern while characters of
               pattern and text are matching at this shift s */
            while(j >= 0 && pat[j] == txt[s+j])
                j--;
    
            /* If the pattern is present at current shift, then index j
               will become -1 after the above loop */
            if (j < 0)
            {
                printf("\n pattern occurs at shift = %d", s);
    
                /* Shift the pattern so that the next character in text
                   aligns with the last occurrence of it in pattern.
                   The condition s+m < n is necessary for the case when
                   pattern occurs at the end of text */
                s += (s+m < n)? m-badchar[(unsigned char) txt[s+m]] : 1;
    
            }
    
            else
                /* Shift the pattern so that the bad character in text
                   aligns with the last occurrence of it in pattern. The
                   max function is used to make sure that we get a positive
                   shift. We may get a negative shift if the last occurrence
                   of bad character in pattern is on the right side of the
                   current character. */
                s += max(1, j - badchar[(unsigned char) txt[s+j]]);
        }
    }
    
    /* Driver program to test above function */
    int main()
    {
        char txt[] = "ABAAAABAACD";
        char pat[] = "AA";
        search(txt, pat);
        return 0;
    }

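For binary packet data, a variant of search() that takes explicit lengths avoids strlen() altogether. The following is a sketch under that assumption, not the code from the linked page: the name search_bytes(), the unsigned char parameters, and returning a match count instead of printing are all choices made for this example. It uses the same bad-character heuristic as the code in the question.

```c
#include <stddef.h>

#define NO_OF_CHARS 256

/* Hypothetical length-aware variant of search(): counts occurrences of
   pat[0..m) in txt[0..n) using the bad-character heuristic. Bytes are
   unsigned, so they are always valid indices into badchar[]. */
size_t search_bytes(const unsigned char *txt, size_t n,
                    const unsigned char *pat, size_t m)
{
    long badchar[NO_OF_CHARS];
    size_t count = 0, s = 0, i;

    if (m == 0 || m > n)
        return 0;

    /* Last position of each byte value in the pattern, -1 if absent */
    for (i = 0; i < NO_OF_CHARS; i++)
        badchar[i] = -1;
    for (i = 0; i < m; i++)
        badchar[pat[i]] = (long)i;

    while (s <= n - m) {
        long j = (long)m - 1;

        /* Compare right to left at the current shift s */
        while (j >= 0 && pat[j] == txt[s + (size_t)j])
            j--;

        if (j < 0) {
            count++;
            /* Align the byte after the match with its last occurrence
               in the pattern, or step by 1 at the end of the text */
            s += (s + m < n) ? (size_t)((long)m - badchar[txt[s + m]]) : 1;
        } else {
            long shift = j - badchar[txt[s + (size_t)j]];
            s += (shift > 1) ? (size_t)shift : 1;
        }
    }
    return count;
}
```

Called with the packet pointer and its captured length, this never looks at memory outside the packet and does not stop at zero bytes, so repeated runs over the same capture should give the same count.
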
0 answers:

No answers