Question

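Edit: I have made major changes: I now use iterators to track the current positions in the bit and char strings, and I pass the strings by const ref. Now, when I copy the sample inputs onto themselves several times to test the clock, everything finishes in under 10 seconds, even for really long bit and char strings and for up to 50 lines of sample input. But when I submit, CodeEval still says the process was aborted after 10 seconds. As I mentioned, they don't share their input, so now that "extensions" of the sample input work, I'm not sure how to proceed. Any ideas for further improving the performance of my recursion would be greatly appreciated.

Note: Memoization was a good suggestion, but I couldn't figure out how to implement it in this case, since I'm not sure how to store the bit-to-char association in a static look-up table. The only thing I thought of was converting the bit values to their corresponding integers, but that risks integer overflow for long bit strings and seems like it would take too long to compute. Further suggestions on memoization here would be greatly appreciated as well.

This is actually one of the moderate CodeEval challenges. They don't share the sample input or output for moderate challenges, but the "fail error" output simply says "aborted after 10 seconds," so my code is getting hung up somewhere.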

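The assignment is simple enough. You take a file path as the single command-line argument. Each line of the file contains a sequence of 0s and 1s and a sequence of As and Bs, separated by a space. You are to determine whether the binary sequence can be transformed into the letter sequence according to the following two rules:

1) Each 0 can be converted to any non-empty sequence of As (e.g., 'A', 'AA', 'AAA', etc.)

2) Each 1 can be converted to any non-empty sequence of As OR Bs (e.g., 'A', 'AA', etc., or 'B', 'BB', etc.), but not a mixture of the letters

The constraints are to process up to 50 lines from the file, with the length of the binary sequence in [1,150] and the length of the letter sequence in [1,1000].

The most obvious starting algorithm is to do this recursively. What I came up with is: for each bit, first collapse the entire next allowed group of characters and test the shortened bit and character strings. If that fails, add back one character at a time from the removed character group and call again.

Here is my complete code. I removed the cmd-line argument error checking for brevity.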

#include <iostream>
#include <fstream>
#include <string>
#include <iterator>

using namespace std;

//typedefs
typedef string::const_iterator str_it;

//declarations
//use const ref and iterators to save time on copying and erasing
bool TransformLine(const string & bits, str_it bits_front, const string & chars, str_it chars_front);

int main(int argc, char* argv[])
{
    //check there are at least two command line arguments: binary executable and file name
    //ignore additional arguments
    if(argc < 2)
    {
        cout << "Invalid command line argument. No input file name provided." << "\n"
             << "Goodbye...";
        return -1;
    }

    //create input stream and open file
    ifstream in;
    in.open(argv[1], ios::in);
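    while(!in.is_open())
    {
        string name; //std::string owns its buffer; reading into an uninitialized char* is undefined behavior
        cout << "Invalid file name. Please enter file name: ";
        if(!(cin >> name))
            return -1; //no more input to read a file name from
        in.clear(); //reset the failed state before retrying
        in.open(name.c_str(), ios::in);
    }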

    //variables
    string line_bits, line_chars;

    //reserve space up to constraints to reduce resizing time later
    line_bits.reserve(150);
    line_chars.reserve(1000);

    int line = 0;

    //loop over lines (<=50 by constraint, ignore the rest)
    while((in >> line_bits >> line_chars) && (line < 50))
    {
        line++;     
        //impose bit and char constraints
        if(line_bits.length() > 150 ||
           line_chars.length() > 1000)
            continue; //skip this line

        (TransformLine(line_bits, line_bits.begin(), line_chars, line_chars.begin()) == true) ? (cout << "Yes\n") : (cout << "No\n");
    }

    //close file
    in.close();

    return 0;
}

bool TransformLine(const string & bits, str_it bits_front, const string & chars, str_it chars_front)
{
    //using iterators so store current length as local const
    //can make these const because they're not altered here
    int bits_length = distance(bits_front, bits.end());
    int chars_length = distance(chars_front, chars.end());

    //check success rule
    if(bits_length == 0 && chars_length == 0)
        return true;

    //Check fail rules:
    //1. bits length is zero (but char is not, by previous if)
    //2. char length is zero (but bits length is not, by previous if)
    //3. next bit is 0 but next char is B
    //the length checks come first so an end iterator is never dereferenced
    if(bits_length == 0 ||
       chars_length == 0 ||
       (*bits_front == '0' && *chars_front == 'B'))
        return false;

    //we now know that chars_length != 0 => chars_front != chars.end()

    //kill a bit and then call recursively with each possible reduction of front char group
    bits_length = distance(++bits_front, bits.end());

    //current char group tracker
    const char curr_char_type = *chars_front; //use const so compiler can optimize
    int curr_pos = distance(chars.begin(), chars_front); //position of current front in char string

    //since chars are 0-indexed, the following is also length of current char group
    //start searching from curr_pos and length is relative to curr_pos so subtract it!!!    
    int curr_group_length = chars.find_first_not_of(curr_char_type, curr_pos)-curr_pos;

    //make sure this isn't the last group!
    if(curr_group_length < 0 || curr_group_length > chars_length)
        curr_group_length = chars_length; //distance to end is precisely distance(chars_front, chars.end()) = chars_length

    //kill the curr_char_group
    //if curr_group_length = char_length then this will make chars_front = chars.end()
    //and this will mean that chars_length will be 0 on next recursive call.
    chars_front += curr_group_length;
    curr_pos = distance(chars.begin(), chars_front);

    //call recursively, adding back a char from the current group until 1 less than starting point
    int added_back = 0;
    while(added_back < curr_group_length) 
    {
        if(TransformLine(bits, bits_front, chars, chars_front))
            return true;

        //insert back one char from the current group
        else
        {
            added_back++;
            chars_front--; //represents adding back one character from the group
        }

    }
    //if here then all recursive checks failed so initial must fail
    return false;
}

They give the following test cases, which my code solves correctly:

Sample input:

1010 AAAAABBBBAAAA
00 AAAAAA
01001110 AAAABAAABBBBBBAAAAAAA
1100110 BBAABABBA

Correct output:

Yes
Yes
Yes
No

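Since a transformation is possible if and only if its copies are, I tried just copying each binary and letter sequence onto itself various numbers of times to see how the clock goes. Even for very long bit and character strings and many lines, it finished in under 10 seconds.

My question is: since CodeEval still says it runs longer than 10 seconds but they don't share their input, does anyone have any further suggestions for improving the performance of this recursion? Or maybe a totally different approach?

Thank you in advance for your help!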

Answer 1

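Here is what I found:

Pass by constant reference
Strings and other large data structures should be passed by constant reference. This allows the compiler to pass a pointer to the original object rather than making a copy of the data structure.

Call functions once, save the result
You are calling bits.length() twice. You should call it once and save the result in a constant variable. This allows you to check the state again without calling the function.

Function calls are expensive for time-critical programs.

Use constant variables
If you are not going to modify a variable after assignment, use const in the declaration:

const char curr_char_type = chars[0];

The const allows the compiler to perform higher-order optimizations and provides safety checks.

Change data structures
Since you may be performing inserts in the middle of a string, you should use a different data structure for the characters. After an insertion, the std::string data type may need to reallocate and move the letters further down. Insertion is faster with a std::list<char> because a linked list only swaps pointers. There may be a trade-off, because a linked list needs to dynamically allocate memory for each character.

Reserve space in strings
When creating the destination strings, you should use a constructor that preallocates or reserves room for the largest string. This prevents the std::string from reallocating. Reallocations are expensive.

Don't erase
Do you really need to erase characters in the string? By using start and end indices, you can overwrite existing letters without having to erase the entire string. Partial erasures are expensive. Complete erasures are not.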
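As a minimal sketch of the index-based idea (the helper name RunLength is hypothetical and does not appear in the original code), the length of the current letter group can be measured from a start index without erasing or copying anything:

```cpp
#include <string>

// Hypothetical helper: length of the run of identical letters that
// starts at index pos. The string itself is never modified.
std::size_t RunLength(const std::string& s, std::size_t pos)
{
    if (pos >= s.size())
        return 0; // nothing left to scan
    // First index at or after pos whose letter differs from s[pos].
    const std::size_t next = s.find_first_not_of(s[pos], pos);
    // npos means the run extends to the end of the string.
    return (next == std::string::npos ? s.size() : next) - pos;
}
```

A caller would then advance its own start index by the returned length instead of calling erase on the string.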

For more assistance, post to Code Review at StackExchange.

Answer 2

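This is a classic recursion problem. However, a naive implementation of the recursion leads to an exponential number of re-evaluations of previously computed function values. To illustrate with a simpler example, compare the running times of the following two functions for a reasonably large N. Don't worry about int overflow.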

int RecursiveFib(int N)
{
    if(N<=1)
        return 1;
    return RecursiveFib(N-1) + RecursiveFib(N-2);
}

int IterativeFib(int N)
{
    if(N<=1)
        return 1;
    int a_0 = 1, a_1 = 1;
    for(int i=2;i<=N;i++)
    {
        int temp = a_1;
        a_1 += a_0;
        a_0 = temp;
    }
    return a_1;
}

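You need to follow a similar approach here. There are two common ways of approaching the problem - dynamic programming and memoization. Memoization is the easiest way to modify your approach. Below is a memoized Fibonacci implementation to illustrate how your implementation can be sped up.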

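//requires #include <vector>
int MemoFib(int N)
{
    if(N<=1)
        return 1;
    //cache shared across calls; -1 marks "not computed yet"
    static vector<int> memo;
    //grow on demand so that index N is always valid
    if(memo.size() <= static_cast<size_t>(N))
        memo.resize(N+1, -1);
    int& res = memo[N];
    if(res!=-1)
        return res;
    return res = MemoFib(N-1) + MemoFib(N-2);
}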
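For the string problem in the question, the memo table does not need to store the bit-to-char association itself. As a minimal sketch (the names CanTransform and TransformLineMemo are made up for illustration and are not from the original post), a recursive call can be identified by two positions: how many bits and how many letters have been consumed so far. Under the stated limits that is at most 150 x 1000 states, and each one is solved only once:

```cpp
#include <string>
#include <vector>

// Hypothetical helper. memo[i][j] is -1 (unknown), 0 (no) or 1 (yes):
// can the bits from index i onward produce the letters from index j onward?
static bool CanTransform(const std::string& bits, const std::string& chars,
                         std::size_t i, std::size_t j,
                         std::vector<std::vector<int> >& memo)
{
    // If either string is exhausted, succeed only when both are.
    if (i == bits.size() || j == chars.size())
        return i == bits.size() && j == chars.size();

    int& res = memo[i][j];
    if (res != -1)
        return res == 1; // already computed for this pair of positions

    res = 0;
    const char letter = chars[j];
    if (bits[i] == '0' && letter == 'B')
        return false; // a 0 can never produce a B

    // Let bit i consume 1, 2, ... letters of the current group.
    for (std::size_t k = j; k < chars.size() && chars[k] == letter; ++k) {
        if (CanTransform(bits, chars, i + 1, k + 1, memo)) {
            res = 1;
            break;
        }
    }
    return res == 1;
}

// Hypothetical entry point: builds a fresh table for each input line.
bool TransformLineMemo(const std::string& bits, const std::string& chars)
{
    std::vector<std::vector<int> > memo(
        bits.size(), std::vector<int>(chars.size(), -1));
    return CanTransform(bits, chars, 0, 0, memo);
}
```

Indexing by position avoids converting bit strings to integers, so the overflow concern raised in the question does not come up in this formulation.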

Answer 3

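Your failure message is "Aborted after 10 seconds" - implying that the program was working fine as far as it went, but it took too long. This is understandable, given that your recursive program takes exponentially more time for longer input strings - it works fine for short (2-8 digit) strings, but will take a huge amount of time for 100+ digit strings (which the test allows). To see how your running time goes up, you should construct some longer test inputs and see how long they take to run. Try things like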

0000000011111111   AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABBBBBBBBBBAAAAAAAA
00000000111111110000000011111111  AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABBBBBBBBBBAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABBBBBBBBBBAAAAAAAA

and longer. You need to be able to handle up to 150 digits and 1000 letters.
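A rough sketch of how such inputs might be built and timed, assuming a C++11 compiler for std::chrono (Repeat and TimeMs are hypothetical helper names, and solve stands for whatever two-argument wrapper is put around the solver under test):

```cpp
#include <chrono>
#include <string>

// Hypothetical helper: concatenate `times` copies of `unit`.
std::string Repeat(const std::string& unit, std::size_t times)
{
    std::string out;
    out.reserve(unit.size() * times);
    for (std::size_t k = 0; k < times; ++k)
        out += unit;
    return out;
}

// Hypothetical helper: wall-clock milliseconds taken by solve(bits, chars).
template <typename Solver>
long long TimeMs(Solver solve, const std::string& bits, const std::string& chars)
{
    const std::chrono::steady_clock::time_point start =
        std::chrono::steady_clock::now();
    solve(bits, chars);
    const std::chrono::steady_clock::time_point stop =
        std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::milliseconds>(stop - start)
        .count();
}
```

Timing the same solver on Repeat(bits, 1), Repeat(bits, 2), Repeat(bits, 4) and so on, with the letters scaled the same way and kept within the 150 and 1000 limits, shows how quickly the running time grows.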

Answer 4

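At CodeEval, you can submit a "solution" that just outputs whatever the input is, and use that to gather their test set. They may have variations, so you may wish to submit it a few times to gather more samples. Some of them are too difficult to solve manually, though... and the ones you can solve manually will also run very quickly at CodeEval, even with inefficient solutions, so there's that to consider.

Anyway, I did this same problem at CodeEval (using VB, of all things), and my solution recursively looked for the "next index" of both A and B, depending on what the "current" index was for where I was in the translation (after checking the stopping conditions first thing in the recursive method). I did not use memoization, but that might have helped speed it up even more.

PS, I have not tried running your code, but it does seem curious that the recursive method contains a while loop within which the recursive method is called... since it's already recursive and should therefore cover every scenario, is that while() loop necessary?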
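On the while loop: one possible loop-free formulation, offered only as an illustrative sketch (TransformLineDP is a made-up name, and this is not the VB solution described above), treats each letter as either starting the run of the next bit or extending the run of the current bit. Filling a table of prefixes bottom-up then needs neither recursion nor an inner search over group lengths:

```cpp
#include <string>
#include <vector>

// Hypothetical name; a dynamic-programming sketch of the same rules.
bool TransformLineDP(const std::string& bits, const std::string& chars)
{
    const std::size_t n = bits.size();
    const std::size_t m = chars.size();
    // ok[i][j] != 0 when the first i bits can produce the first j letters.
    std::vector<std::vector<char> > ok(n + 1, std::vector<char>(m + 1, 0));
    ok[0][0] = 1; // no bits produce no letters

    for (std::size_t i = 1; i <= n; ++i) {
        for (std::size_t j = 1; j <= m; ++j) {
            // A 1 may produce A or B; a 0 may only produce A.
            const bool allowed = bits[i - 1] == '1' || chars[j - 1] == 'A';
            if (!allowed)
                continue;
            // Letter j is the first letter produced by bit i.
            const bool starts_run = ok[i - 1][j - 1] != 0;
            // Letter j repeats letter j-1, which bit i already produced.
            const bool extends_run = j >= 2 && chars[j - 2] == chars[j - 1] &&
                                     ok[i][j - 1] != 0;
            ok[i][j] = (starts_run || extends_run) ? 1 : 0;
        }
    }
    return ok[n][m] != 0;
}
```

The work is proportional to the number of bits times the number of letters, so the largest allowed input (150 bits, 1000 letters) needs about 150,000 table cells.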

Recursive string transformation
