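Mapping variable-length strings to int

Time: 2013-07-11 03:45:30

Tags: c string int arrays

I can't figure out how to write a function that takes the following input and produces the following output: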

in (int) | out (char *)
0        | ""
1        | "a"
2        | "b"
3        | "c"
4        | "aa"
5        | "ab"
6        | "ac"
7        | "ba"
8        | "bb"
...

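This is not simply a matter of converting the input to ternary, because "a" and "aa" are different strings (whereas there is no difference between 0 and 00).

I did find a correlation between the length of the string and the input when only a and b are used, namely len = floor(log2(in + 1)):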

in (int) | floor(log2(in + 1)) | out (char *)
0        | 0                   | ""
1        | 1                   | "a"
2        | 1                   | "b"
3        | 2                   | "aa"
4        | 2                   | "ab"
5        | 2                   | "ba"
6        | 2                   | "bb"
7        | 3                   | "aaa"
8        | 3                   | "aab"

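If there are n different valid characters, what is the general correlation between the output length and the input value?

4 Answers:

Answer 0: (score: 4)

This is related to, but distinctly different from, Calc cell convertor in C. This code was quickly derived from that code: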

#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* These declarations should be in a header */
extern char     *b3_row_encode(unsigned row, char *buffer);
extern unsigned  b3_row_decode(const char *buffer);

static char *b3_encode(unsigned row, char *buffer)
{
    unsigned div = row / 3;
    unsigned rem = row % 3;
    if (div > 0)
        buffer = b3_encode(div-1, buffer);
    *buffer++ = rem + 'a';
    *buffer = '\0';
    return buffer;
}

char *b3_row_encode(unsigned row, char *buffer)
{
    if (row == 0)
    {
        *buffer = '\0';
        return buffer;
    }
    return(b3_encode(row-1, buffer));
}

unsigned b3_row_decode(const char *code)
{
    unsigned char c;
    unsigned r = 0;
    while ((c = *code++) != '\0')
    {
        if (!isalpha(c))
            break;
        c = tolower(c);
        r = r * 3 + c - 'a' + 1;
    }
    return r;
}

#ifdef TEST

static const struct
{
    unsigned col;
    char     cell[10];
} tests[] =
{
    {    0,      "" },
    {    1,     "a" },
    {    2,     "b" },
    {    3,     "c" },
    {    4,    "aa" },
    {    5,    "ab" },
    {    6,    "ac" },
    {    7,    "ba" },
    {    8,    "bb" },
    {    9,    "bc" },
    {   10,    "ca" },
    {   11,    "cb" },
    {   12,    "cc" },
    {   13,   "aaa" },
    {   14,   "aab" },
    {   16,   "aba" },
    {   22,   "baa" },
    {  169, "abcba" },
};
enum { NUM_TESTS = sizeof(tests) / sizeof(tests[0]) };

int main(void)
{
    int pass = 0;

    for (int i = 0; i < NUM_TESTS; i++)
    {
        char buffer[32];
        b3_row_encode(tests[i].col, buffer);
        unsigned n = b3_row_decode(buffer);
        const char *pf = "FAIL";

        if (strcmp(tests[i].cell, buffer) == 0 && n == tests[i].col)
        {
            pf = "PASS";
            pass++;
        }
        printf("%s: Col %3u, Cell (wanted: %-8s vs actual: %-8s) Col = %3u\n",
               pf, tests[i].col, tests[i].cell, buffer, n);
    }

    if (pass == NUM_TESTS)
        printf("== PASS == %d tests OK\n", pass);
    else
        printf("!! FAIL !! %d out of %d failed\n", (NUM_TESTS - pass), NUM_TESTS);

    return (pass == NUM_TESTS) ? 0 : 1;
}

#endif /* TEST */

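The code includes a test program, a function that converts from a string to an integer, and a function that converts from an integer to a string. The tests run round-trip (back-to-back) conversions. Zero is mapped to the empty string and back, as the first test case shows.

Sample output: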

PASS: Col   0, Cell (wanted:          vs actual:         ) Col =   0
PASS: Col   1, Cell (wanted: a        vs actual: a       ) Col =   1
PASS: Col   2, Cell (wanted: b        vs actual: b       ) Col =   2
PASS: Col   3, Cell (wanted: c        vs actual: c       ) Col =   3
PASS: Col   4, Cell (wanted: aa       vs actual: aa      ) Col =   4
PASS: Col   5, Cell (wanted: ab       vs actual: ab      ) Col =   5
PASS: Col   6, Cell (wanted: ac       vs actual: ac      ) Col =   6
PASS: Col   7, Cell (wanted: ba       vs actual: ba      ) Col =   7
PASS: Col   8, Cell (wanted: bb       vs actual: bb      ) Col =   8
PASS: Col   9, Cell (wanted: bc       vs actual: bc      ) Col =   9
PASS: Col  10, Cell (wanted: ca       vs actual: ca      ) Col =  10
PASS: Col  11, Cell (wanted: cb       vs actual: cb      ) Col =  11
PASS: Col  12, Cell (wanted: cc       vs actual: cc      ) Col =  12
PASS: Col  13, Cell (wanted: aaa      vs actual: aaa     ) Col =  13
PASS: Col  14, Cell (wanted: aab      vs actual: aab     ) Col =  14
PASS: Col  16, Cell (wanted: aba      vs actual: aba     ) Col =  16
PASS: Col  22, Cell (wanted: baa      vs actual: baa     ) Col =  22
PASS: Col 169, Cell (wanted: abcba    vs actual: abcba   ) Col = 169
== PASS == 18 tests OK

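Answer 1: (score: 1)

You're on the right track: each group of N-character strings is just the N-digit numbers in base M, where M is the number of symbols. So your sequence is the 0-digit ternaries (""), followed by the 1-digit ternaries ("a", "b", "c"), and so on.

The number of digits for a given rank n is floor(log3(2n+1)), and the first rank of each length d is (3**d-1)/2. So the 10000th entry in the sequence has 9 digits; the first 9-digit string ("aaaaaaaaa") is number 9841. 10000-9841 is 159, which in base 3 is 12220, so the 10000th string is "aaaabccca".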
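The length and first-rank formulas can be checked without floating-point logarithms. The following is only a sketch: the helper name `length_for_rank` and its parameters are made up for illustration, and it assumes k >= 2 symbols and a rank small enough that the running power of k does not overflow `unsigned`. It relies on the fact that there are k^d strings of length d, so the first rank of length d is (k^d - 1)/(k - 1), which for k = 3 is the (3**d-1)/2 given above. The `main` is guarded by `TEST`, following the convention of answer 0.

```c
#include <stdio.h>

/* Hypothetical helper: number of characters in the string for rank n
 * when k distinct symbols are available (assumes k >= 2).
 * If first_rank is non-NULL, it receives the rank of the first string
 * of that length, i.e. (k^d - 1) / (k - 1). */
static unsigned length_for_rank(unsigned n, unsigned k, unsigned *first_rank)
{
    unsigned d = 0;      /* candidate length */
    unsigned first = 0;  /* rank of the first string of length d */
    unsigned count = 1;  /* k^d: how many strings have length d */

    /* Skip whole blocks of equal-length strings until n falls inside one. */
    while (n - first >= count)
    {
        first += count;
        count *= k;
        d++;
    }
    if (first_rank != NULL)
        *first_rank = first;
    return d;
}

#ifdef TEST
int main(void)
{
    unsigned first;
    unsigned d = length_for_rank(10000, 3, &first);

    printf("rank 10000: %u characters, first rank %u, offset %u\n",
           d, first, 10000 - first);
    return 0;
}
#endif /* TEST */
```

Compiled with -DTEST, this reports 9 characters, first rank 9841 and offset 159 for rank 10000, matching the arithmetic above.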

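Answer 2: (score: 1)

This simple code works for both of your example cases, and it should keep working if you increase the number of characters you are using. Here is the easy version in C#: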

string Convert(int input)
{
    char[] chars = { 'a', 'b', 'c' };

    string s = string.Empty;
    while (input > 0)
    {
        int digit = (input - 1) % chars.Length;
        s = s.Insert(0, chars[digit].ToString());

        input = (input-1) / chars.Length;
    }

    return s;
}

In C it's a bit more complicated:

#include <stdlib.h>
#include <string.h>

/* Returns a newly allocated string that the caller must free(),
 * or NULL if the allocation fails. */
char *Convert(int input)
{
    const char *chars = "abc";
    char result[50] = "";
    int numChars = (int)strlen(chars);
    int place = 0;

    // Generate the result string digit by digit from the least significant digit
    // The string generated by this is in reverse
    while (input > 0)
    {
        int digit = (input - 1) % numChars;

        result[place] = chars[digit];

        input = (input - 1) / numChars;
        place++;
    }

    // Fix the result string by reversing it
    // (one extra byte is allocated for the terminating '\0')
    char *reversedResult = malloc((size_t)place + 1);
    if (reversedResult == NULL)
        return NULL;
    for (int i = 0; i < place; i++)
    {
        reversedResult[i] = result[place - 1 - i];
    }
    reversedResult[place] = '\0';

    return reversedResult;
}

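Answer 3: (score: 0)

Isn't it simply a = 1, b = 2, c = 3, each multiplied by 3^n, where n is the position in the string? That seems like a slightly odd definition of ternary. The empty string is equivalent no matter how long it is, since anything other than a, b and c contributes 0 to the output value.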
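As a worked example of that weighting, assuming positions are counted from the right starting at 0, the string "abcba" from the test table in answer 0 evaluates as follows:

```latex
\begin{aligned}
\text{value}(\texttt{abcba})
  &= 1\cdot 3^{4} + 2\cdot 3^{3} + 3\cdot 3^{2} + 2\cdot 3^{1} + 1\cdot 3^{0} \\
  &= 81 + 54 + 27 + 6 + 1 \\
  &= 169
\end{aligned}
```

This agrees with the test entry that pairs 169 with "abcba".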

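Also, the two tables in your question seem to conflict. In the first table 5 -> "ab", whereas in the second table 5 -> "ba". Is that intentional? If so, the function is not one-to-one and your question becomes even more ambiguous.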
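One possible reading is that the two tables use the same mapping over different alphabets: the first uses a, b and c, while the second uses only a and b. The sketch below illustrates that reading. The helper `encode_with_alphabet`, its parameters and its buffer limits are assumptions made for this example and do not come from the question. The `main` is guarded by `TEST`, as in answer 0.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes the string for rank n over the given
 * alphabet into buf (capacity size, including the terminator).
 * Returns buf, or NULL if the alphabet is empty or a buffer is too small. */
static char *encode_with_alphabet(unsigned n, const char *alphabet,
                                  char *buf, size_t size)
{
    size_t k = strlen(alphabet);
    char tmp[64];        /* digits, least significant first */
    size_t len = 0;

    if (k == 0 || size == 0)
        return NULL;
    while (n > 0)
    {
        if (len == sizeof(tmp))
            return NULL;
        /* Subtracting 1 first makes the digits run 1..k instead of 0..k-1. */
        tmp[len++] = alphabet[(n - 1) % k];
        n = (unsigned)((n - 1) / k);
    }
    if (len + 1 > size)
        return NULL;
    /* Reverse into the caller's buffer so the most significant digit comes first. */
    for (size_t i = 0; i < len; i++)
        buf[i] = tmp[len - 1 - i];
    buf[len] = '\0';
    return buf;
}

#ifdef TEST
int main(void)
{
    char three[16];
    char two[16];

    if (encode_with_alphabet(5, "abc", three, sizeof(three)) == NULL ||
        encode_with_alphabet(5, "ab", two, sizeof(two)) == NULL)
        return 1;
    printf("5 over \"abc\": %s\n", three);
    printf("5 over \"ab\":  %s\n", two);
    return 0;
}
#endif /* TEST */
```

With three symbols 5 comes out as "ab", and with two symbols it comes out as "ba", so under this reading each table is one-to-one on its own alphabet.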