C - Recursive functions, permutations, heap space, and mysterious bugs

Time: 2012-06-18 06:12:01

Tags: c recursion stack permutation

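Thanks in advance for taking the time to read my post. I'll be the first to admit that this code is a bit convoluted.

This weekend my project has been a program that takes a format string as an argument and spits out every possible permutation of that string. The format string looks similar to a printf format string. Constants stay in the string as they are, and variable parts look like this: %[number-of-chars][type-of-permutation], where

[type-of-permutation] can be 'd' for integers, 'u' for uppercase alphabetic characters, 'l' for lowercase letters, 'a' for all alphabetic characters, and 'n' for alphanumerics.

There is one special case, "%w", which reads words from a word list instead of incrementing each individual character.

Things generally work the way I want them to. However, while testing and debugging with gdb, I found two bugs that I can't explain.

Bug #1: The part string (part[i]->guts) is initially filled with the minimum character for its type (for example, %5d is filled with "00000"). However, after an unrelated call (fopen) executes, the part contains arbitrary characters, usually with values below 0x09. This happens at line 143, which I have marked in the source.

Bug #2: When I run the program like this: "./Permute %w%2d -w words", where "words" is the Ubuntu word list, the program throws a segmentation fault after reaching roughly line 4,200 of the word list. This is baffling too. Because 4,200 is such a random number, I can only assume the program is running out of stack memory or something similar.

I've split this code across several header files. I'll post the main code first, followed by the included header files below.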

#include <stdio.h>
#include "../Libraries/permute.h"
#include <malloc.h>
#include <string.h>


void main (int argc, char * argv[]) {

  //Allocate ten base pointers to be inserted into our master string.
  struct base * part[10];

  //Allocate a character string on the heap to be manipulated by our program.
  char * master = malloc(100);

  //Organize arguments
  if(argc < 2) {
      printf("Usage:\n./Permute [Format String] -p -w [word-list] -o [output-file]\n");
      printf("Format string with the general form %%[number-of-chars][type]\n");
      printf("Where [number-of-chars] is between 1 and 9\n");
      printf("And [type] is d for integer, a for ALPHA, n for ALPHA_NUMERIC, u for UPPERCASE, l for LOWERCASE\n");
      printf("-p increments all parts in parallel. \n");    
      printf("-w Reads words for %%w from [word-list]\n");
      return;
  }

    int count;
    int parallel = 0;
    char * wordlist = NULL;
    FILE * file = NULL;
    char * output = NULL;
    FILE * out = NULL;
    for(count = 1; count < argc; count++){
      char first = *argv[count];
      int second = *(argv[count] + 1);  
      if (first == '-') {
        switch (second) {
            case 'p':
                parallel = 1;
                break;
            case 'w':
                wordlist = argv[count+1];
                break;
            case 'o':
                output = argv[count+1];
                break;
        }
    }
}

  //Parse format string
  int i = 0;
//This is the counter for the current base struct we're operating on.
int x;
int size;
int type;
//Create a pointer that points to a position inside our master string.
char * current = master;
//Cycle through each character in the format string.
for(count = 0; count < strlen(argv[1]); count++) {

    //If the character is a constant, set the current character in the master string to the character in the format string. 
    if (*(argv[1] + count) != '%') {
        *current = *(argv[1] + count);
        current++;
    }

    //Otherwise, initialize a base structure using one of the pointers in the array declared above.
    //Fill the structure according to the format string with rules specified above.
    else {
        size = *(argv[1] + count + 1) - '0';

        //If the base part is a wordlist, do special things.
        if (*(argv[1] + count + 1) == WORDLIST) {
            type = WORDLIST;
            part[i] = malloc(sizeof(int) * 3 + sizeof(FILE *) + MAX_WORD_SIZE);
            memset(part[i], 0, sizeof(int) * 3 + sizeof(FILE *) + MAX_WORD_SIZE);
            part[i]->file = fopen(wordlist, "r");
            if (part[i]->file == NULL)
                perror("Could not open file");
            fgets(part[i]->guts, MAX_WORD_SIZE, part[i]->file);
            part[i]->size = StripChar(part[i]->guts, strlen(part[i]->guts), '\n');
        }

        //Otherwise, fill out the base structure like normal.
        else {
            type = (int)*(argv[1] + count + 2);
            part[i] = malloc(sizeof(int) * 3 + sizeof(FILE *) + size);          
            part[i]->size = size;
        }
        part[i]->type = type;
        part[i]->offset = (int)(current - master);

        //If the base part is not a wordlist (its size does not vary), move the position in the master string forward
        //based on the size of the part. Move the position in the format string forward 2 places.
        if (type != WORDLIST) {
            current += size;
            count += 2;
        }

        //If the base part is a wordlist (meaning its size varies), keep the pointer inside the master string where it is.
        //We will move things around later to make everything fit.
        else {
            count += 1;
        }

        //Set all characters inside the base part to their minimum value.
        for (x = 0; x < size; x++) {
            switch (type) {
                case NUMERIC:
                    part[i]->guts[x] = '0';
                    break;              
                case ALPHA:
                    part[i]->guts[x] = 'A';
                    break;
                case ALPHA_NUMERIC:
                    part[i]->guts[x] = '0';
                    break;
                case UPPERCASE:
                    part[i]->guts[x] = 'A';
                    break;
                case LOWERCASE:
                    part[i]->guts[x] = 'a';
                    break;
            }
        }

        //Move on to the next base structure.
        i++;
    }
}
//Terminate the masterstring with a null character.
*current = '\0';
//Calculate the size of the master string. (Without wordlist parts)
size_t mastersize = (size_t)(current - master);
//Keep track of the number of characters we've inserted into the master string for each wordlist part.
int wordsize = 0;
//Record the number of parts to cycle through in the master string
int max = i;

if (output != NULL) {
    out = fopen(output, "w+");
}
/*BUG #1: THIS IS WHERE THE STRING GETS CHANGED. NO IDEA WHY. part[i]->guts was something like "AAAA", now it's something like 0x0002000200020002*/

while (1) { 
    wordsize = 0;

    //If parallel flag is set, increment all parts at once.
    if (parallel) {
        for (i = 0; i < max; i++) {
            Increment(part[i]->guts, part[i]->size, 0, part[i]->type);

            //Copy the incremented part into the master string at offset
            strncpy(master + part[i]->offset, part[i]->guts, part[i]->size);
        }

        //Then, once all parts are copied into the master, insert the words from the wordlist into the master string,
        //moving the rest of the string around as needed.
        for (i = 0; i < max; i++) {
            if (part[i]->type == WORDLIST) {
                memmove(master + part[i]->offset + part[i]->size + wordsize, master + part[i]->offset + wordsize, mastersize - part[i]->offset); 
                strncpy(master + part[i]->offset + wordsize, part[i]->guts, part[i]->size); 
                wordsize += part[i]->size;          
            }       
        }
    }

    //Otherwise, just increment the rightmost part.
    else {
        IncrementC(part, max - 1, wordlist);

        //Copy the incremented part into the master string at offset
        for (i = 0; i < max; i++) {
            if (part[i]->type != WORDLIST)
                strncpy(master + part[i]->offset, part[i]->guts, part[i]->size);        
        }

        //Then, once all parts are copied into the master, insert the words from the wordlist into the master string,
        //moving the rest of the string around as needed.       
        for (i = 0; i < max; i++) {
            if (part[i]->type == WORDLIST) {
                memmove(master + part[i]->offset + part[i]->size + wordsize, master + part[i]->offset + wordsize, mastersize - part[i]->offset); 
                strncpy(master + part[i]->offset + wordsize, part[i]->guts, part[i]->size); 
                wordsize += part[i]->size;          
            }       
        }
    }

    //Terminate the master string with a zero again (In case we overwrote it)
    *(master + mastersize + wordsize) = '\0';

    //Print the master string   
    if (out != NULL) {
        fwrite(master, mastersize + wordsize, 1, out);
        fwrite("\n", 1, 1, out);
    }
    printf("%s\n", master);
    for (i = 0; i < max; i++) {
        if (part[i]->type == WORDLIST) {
            memmove(master + part[i]->offset, master + part[i]->offset + part[i]->size, mastersize - part[i]->offset);      
        }       
    }
}
}

// PERMUTE.H

#define NUMERIC 100
#define ALPHA 97
#define ALPHA_NUMERIC 110
#define LOWERCASE 108
#define UPPERCASE 117
#define WORDLIST 119
#define MAX_WORD_SIZE 50
#include <stdio.h>
#include "files.h"

//Used for multipart strings
struct base {
    int type;   //Determines which set of values the character should cycle through.
    int offset; //The offset of the part from the beginning of the multipart string.
    int size;   //Size of the part.
    FILE * file;    //File pointer used for cycling through word lists
    char guts[];    //The actual part to be inserted into the string.
};

//Recursive function Increment takes a string of length len and increments the character found [offset] number of chars
//left of the end of the string. Type determines which values the character should cycle through. (A-Z, a-z, 0-9, etc.)
int Increment(char * start, size_t len, int offset, int type) {
    char * stop = start + len - 1;
    char * place = stop - offset;
    int min, max, count;

    //Setup break points to determine if incremented char is at the end of its cycle for a given type.
    switch (type) {
        case NUMERIC:
            min = (int)'0';
            max = (int)'9';
            break;
        case ALPHA:
            min = (int)'A';
            max = (int)'z';
            break;
        case ALPHA_NUMERIC:
            min = (int)'0';
            max = (int)'z';
            break;
        case UPPERCASE:
            min = (int)'A';
            max = (int)'Z';
            break;
        case LOWERCASE:
            min = (int)'a';
            max = (int)'z';
            break;
        default:
            return -1;
    }

    //If our specified character is not greater than its maximum value, increment the character based on
    //the type of incrementation specified.
    if (*place < max) {
        switch (type) {
            case NUMERIC:
                *place += 1;        
                return 1;
                break;
            case ALPHA:
                if (*place == 'Z')
                    *place = 'a';
                else
                    *place += 1;
                return 1;
                break;
            case ALPHA_NUMERIC:
                if (*place == '9')
                    *place = 'A';
                if (*place == 'Z')
                    *place = 'a';
                else
                    *place += 1;
                return 1;
                break;
            case UPPERCASE:
                *place += 1;        
                return 1;
                break;
            case LOWERCASE:
                *place += 1;        
                return 1;
                break;
            default:
                return -1;
        }
    }

    //If the character is at its maximum value, set it to its minimum and Increment() the next character
    //to the left. If there is no character to the left, set all characters to zero and return 0.
    else {
        if(place == start) {
            for (count = 0; start + count < stop; count++)
                *(start + count) = (char)min;
            *start = (char)min;
            return 0;
        }
        else {
            *place = (char)min;
            return Increment(start, len, offset + 1, type);
        }
    }
}

//IncrementC() stands for Increment Combined. Takes an array of pointers to base structures, an offset, and a string
//with the filename of a wordlist. 
int IncrementC (struct base * part[], int offset, char * wordlist) {        
    //If the type of permutation is anything but a wordlist.    
    if (part[offset]->type != WORDLIST) {   
        //Call Increment() on the base structure determined by offset.
        //If Increment() returns 0, the function calls itself on the next base structure to the left.       
        if ( Increment(part[offset]->guts, part[offset]->size, 0, part[offset]->type) == 0) {
            if (offset == 0)
                return 0;
            else
                IncrementC(part, offset - 1, wordlist);
        }
    }

    //If the type of permutation is a wordlist
    else {
        //Check to see if the base structure has an open file descriptor associated with it.
        //If not, open the file specified by wordlist.
        if (part[offset]->file == NULL) {
            if (wordlist != NULL)
                part[offset]->file = fopen(wordlist, "r");
            if (part[offset]->file == NULL)
                perror("Could not open word list");
        }

        //Get the next line from the wordlist file. If at EOF, reopen the file and get the first line of the new file descriptor.
        if(fgets(part[offset]->guts, MAX_WORD_SIZE, part[offset]->file) == NULL) {
            fclose(part[offset]->file);
            part[offset]->file = fopen(wordlist, "r");
            if (part[offset]->file == NULL)
                perror("Could not open word list");
            fgets(part[offset]->guts, MAX_WORD_SIZE, part[offset]->file);   

            //If the base structure is the farthest to the left in the string, return 0. Otherwise, call this
            //function on the base structure to the left.           
            if (offset == 0) {
                part[offset]->size = StripChar(part[offset]->guts, strlen(part[offset]->guts), '\n');
                return 0;
            }
            else {
                IncrementC(part, offset - 1, wordlist);
            }
        }
        //Strip the new line character from the string.
        part[offset]->size = StripChar(part[offset]->guts, strlen(part[offset]->guts), '\n');
    }   
}

// FILES.H

#include <stdio.h>
#include <malloc.h>
#include <string.h>

int ReadFile(char * filename, char * output) {
  //Open file and store in packet
  FILE *input;
  input = fopen(filename, "r");
  int count = 0;
  char c = fgetc(input);
  for(count = 0; c != '\0' && c != EOF; c = fgetc(input)) {
      *(char *)(output + count) = c;
      count++;
}
  fclose(input);

  printf("Read: %s from %s\n", output, filename);
  return (count - 1);
}

  int StripChar (char * in, size_t len, char strip) {
  int count;
  int outcount = 0;
  char * out = malloc(len);

  for (count = 0; count < len; count++) {
      if(*(in + count) != strip) {
        *(out + outcount) = *(in + count);
        outcount++;
    }
}

  *(out + outcount) = '\0';

  strcpy(in, out);

  return outcount;

}

//END

1 answer:

Answer 0 (score: 5)

I'll keep editing and adding more bugs as I find them.

First, there is a bug:

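  1. argv holds strings: argv is a one-dimensional array of strings. You can't typecast a pointer directly to an int, and dereferencing it as below converts only a single character of the string (here the one at index 1) to an int, not the number the string spells out.

    int second = *(argv[count] + 1);  
    
  2. To convert a numeric string argument to an integer, you have to use a function from the atoi family.

       int second = atoi ( argv[count] );  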
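
atoi cannot report failure (it returns 0 for "abc" just as it does for "0"), so a stricter conversion is often preferable. The following is only a sketch built on the standard strtol; the helper name parse_int is hypothetical and not part of the original program.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: parse a whole string as a base-10 int.
 * Returns 1 and stores the value in *out on success, 0 on failure. */
int parse_int(const char *text, int *out) {
    char *end = NULL;
    long value;

    if (text == NULL || out == NULL || *text == '\0')
        return 0;

    errno = 0;
    value = strtol(text, &end, 10);

    /* Reject trailing junk, overflow of long, and values outside int. */
    if (*end != '\0' || errno == ERANGE || value < INT_MIN || value > INT_MAX)
        return 0;

    *out = (int)value;
    return 1;
}
```

A caller would check the return value before using the result, for example `if (!parse_int(argv[count], &second)) { /* report bad argument */ }`.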
    

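    2. Your code has absolutely no, or very little, error handling, and that causes a lot of problems. Below is a likely root cause of the memory corruption, in this function in Permute.h: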

    int Increment(char * start, size_t len, int offset, int type){
    char * stop = start + len - 1;
    char * place = stop - offset;
    <snip..>
    

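    If (stop - offset) ends up below your start pointer, the write through place lands outside the buffer, which means possible memory corruption.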
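
One way to guard against this is to validate len and offset before doing any pointer arithmetic, since forming a pointer below start is already undefined behaviour. This is a minimal sketch; place_in_bounds is a hypothetical helper, not something from the original Permute.h.

```c
#include <stddef.h>

/* Hypothetical helper, not part of the original code: returns a pointer to
 * the character `offset` positions left of the end of the buffer, or NULL
 * when that position would fall outside [start, start + len). */
char *place_in_bounds(char *start, size_t len, int offset) {
    if (start == NULL || len == 0)
        return NULL;
    if (offset < 0 || (size_t)offset >= len)
        return NULL;
    return start + (len - 1 - (size_t)offset);
}
```

Increment() could call something like this at the top and return -1 when it gets NULL, instead of computing stop and place unconditionally.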

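    Remember the rule: You can dodge a million bullets. But it takes just one! All the hard work and effort that went into a program is wasted if it crashes. Once you work in an industry environment, the quality of the code matters, not just its functionality. And quality only improves with more error checking in the code.

    I put a lot of weight on error checking because, as a developer, the impression you make depends on how often your code crashes.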
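
As an illustration of that kind of checking: the posted code calls perror when fopen fails but then passes the NULL FILE pointer to fgets anyway. A sketch of a stricter version follows; read_first_line is a hypothetical helper name, and the error policy (return -1) is an assumption.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: opens `path`, reads its first line into `buf`
 * (at most bufsize - 1 characters) and strips the trailing newline.
 * Returns the line length, or -1 if the file cannot be opened or is empty. */
int read_first_line(const char *path, char *buf, size_t bufsize) {
    FILE *fp;
    size_t len;

    if (path == NULL || buf == NULL || bufsize == 0)
        return -1;

    fp = fopen(path, "r");
    if (fp == NULL) {
        perror("Could not open file");
        return -1;              /* stop here instead of reading from NULL */
    }

    if (fgets(buf, (int)bufsize, fp) == NULL) {
        fclose(fp);
        return -1;              /* empty file or read error */
    }
    fclose(fp);

    len = strcspn(buf, "\n");   /* index of the first newline, if any */
    buf[len] = '\0';
    return (int)len;
}
```

Note that this also covers the case where -w was never given, because wordlist is then NULL and the helper refuses it up front.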

    3. You need to make sure you allocate memory appropriately.

    //Allocate a character string on the heap to be manipulated by our program.
    char * master = malloc(100);
    

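    What does 100 mean in this case? Is it supposed to be 100 bytes, or 100 ints? malloc takes a size in bytes, so as written it is 100 bytes.

    An allocation that states the element type explicitly is (sizeof(char) is 1 by definition, so the size is unchanged, but the intent is clearer):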

    char * master = malloc(100 * sizeof(char));
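
The same concern applies to the part[i] allocations, which add up member sizes by hand (sizeof(int) * 3 + sizeof(FILE *) + size). The compiler may insert padding, so sizeof(struct base) can be larger than that sum; on a typical 64-bit platform with 4-byte int and 8-byte pointers it is 24 rather than 20, which would leave guts a few bytes short. Whether this explains Bug #1 on your machine is an assumption, but letting sizeof do the arithmetic avoids the question. A sketch, with base_alloc as a hypothetical helper:

```c
#include <stdio.h>
#include <stdlib.h>

/* Same layout as struct base in permute.h. */
struct base {
    int type;
    int offset;
    int size;
    FILE *file;
    char guts[];
};

/* Hypothetical helper: allocates a zeroed part with room for `size`
 * characters in guts plus a terminating '\0'. Returns NULL on failure. */
struct base *base_alloc(int type, int size) {
    struct base *part;

    if (size < 0)
        return NULL;

    /* sizeof(struct base) already accounts for any padding around `file`. */
    part = calloc(1, sizeof(struct base) + (size_t)size + 1);
    if (part == NULL)
        return NULL;

    part->type = type;
    part->size = size;
    part->file = NULL;
    return part;
}
```

For the word-list case the call would be something like `base_alloc(WORDLIST, MAX_WORD_SIZE)`, and every part obtained this way is released with a plain free().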
    

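    4. You malloc a bunch of memory in main. But where do you free it? I don't see a single call to free. Remember that any resource you take from the system (files, thread handles, memory, and so on) must be returned. Otherwise you end up with a resource leak!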
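
StripChar in files.h is a concrete case: it calls malloc(len) for every word and never frees the buffer, and when nothing is stripped it writes the terminating '\0' at out[len], one byte past the allocation. Whether that is what produces the crash around line 4,200 is an assumption, but both problems go away if the function works in place. A sketch follows; StripCharInPlace is a hypothetical replacement, and it assumes `in` holds at least len + 1 bytes, which is true when len comes from strlen(in).

```c
#include <stddef.h>

/* In-place variant of StripChar from files.h (a sketch, not the original):
 * removes every occurrence of `strip` from the first `len` characters of
 * `in`, terminates the result, and returns the new length. No allocation,
 * so there is nothing to free. */
int StripCharInPlace(char *in, size_t len, char strip) {
    size_t read;
    size_t write = 0;

    for (read = 0; read < len; read++) {
        if (in[read] != strip)
            in[write++] = in[read];
    }
    in[write] = '\0';   /* write <= len, and in[len] held the old '\0' */
    return (int)write;
}
```

The buffers that do have to stay on the heap (master and each part[i]) still need a matching free(), and each part[i]->file a matching fclose(), once the main loop has a way to terminate.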