Guessing a word from random triplets of its letters

Time: 2014-11-28 17:37:48

Tags: java c++ string algorithm data-structures

I came across this problem and cannot work out a complete solution to it. (Source: http://www.careercup.com/question?id=5678056593162240)

  

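Given a function

getRandomTripplet()

which returns a random triplet of letters from a string. You do not know the string; you have to guess it correctly using calls to this function. The length of the string is also given.

Let's say the string is helloworld. The function getRandomTriplet will return things like

hlo hew wld owo

The function maintains the relative order of the letters, so it will never return

ohl, since h comes before o in the string, or owe, since w comes after e.

The string is not known; you are only given the length of the string.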
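To make the rules of the problem concrete, here is a minimal sketch of what such an oracle might look like, together with a check for the "relative order" rule. Both `getRandomTriplet` (with an explicit `secret` and generator argument) and `isSubsequence` are hypothetical helpers written for illustration; the sketch also assumes the three positions are distinct and chosen uniformly at random, which the problem statement does not spell out.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <random>
#include <string>
#include <vector>

// Hypothetical stand-in for the oracle: picks three distinct positions
// at random and returns their letters in the order they have in `secret`.
std::string getRandomTriplet(const std::string& secret, std::mt19937& gen) {
    assert(secret.size() >= 3);
    std::vector<std::size_t> idx(secret.size());
    for (std::size_t i = 0; i < idx.size(); ++i)
        idx[i] = i;
    std::shuffle(idx.begin(), idx.end(), gen);   // random choice of positions
    std::sort(idx.begin(), idx.begin() + 3);     // restore relative order
    std::string out;
    for (int k = 0; k < 3; ++k)
        out.push_back(secret[idx[k]]);
    return out;
}

// True if `pattern` can be read left to right inside `text`,
// possibly skipping letters of `text`.
bool isSubsequence(const std::string& pattern, const std::string& text) {
    std::size_t p = 0;
    for (char c : text)
        if (p < pattern.size() && c == pattern[p])
            ++p;
    return p == pattern.size();
}
```

Under these assumptions, every value returned for "helloworld" is a three-letter subsequence of it, so `isSubsequence("hlo", "helloworld")` holds while `isSubsequence("ohl", "helloworld")` does not.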

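My approach so far is to run getRandomTripplet() 1000 times and then find the repeated characters by picking those with the highest number of occurrences (probability > 1/10).

Am I on the right track?

Thanks for your help. Cheers!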
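The counting idea from the question can be sketched roughly as follows. This is only an illustration under an assumption the problem does not state: that every triplet of positions is equally likely, so each position shows up in a triplet with probability 3 / length. The function name `estimateLetterCounts` and the `oracle` callable (a stand-in for getRandomTripplet()) are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <functional>
#include <map>
#include <string>

// Estimates how many times each letter occurs in the hidden string.
// `oracle` is a hypothetical callable standing in for getRandomTripplet().
// Assumption: all position triplets are equally likely, so a letter that
// occurs m times is expected to be seen samples * 3 * m / length times.
std::map<char, int> estimateLetterCounts(const std::function<std::string()>& oracle,
                                         unsigned length, unsigned samples) {
    assert(length >= 3 && samples > 0);

    // tally how often each letter is observed across all sampled triplets
    std::map<char, unsigned long> seen;
    for (unsigned i = 0; i < samples; ++i)
        for (char c : oracle())
            ++seen[c];

    // invert the expectation to get an estimate of the multiplicity m
    std::map<char, int> counts;
    for (const auto& kv : seen) {
        const double m = static_cast<double>(kv.second) * length / (3.0 * samples);
        counts[kv.first] = static_cast<int>(std::lround(m));
    }
    return counts;
}
```

Such an estimate only recovers which letters occur and roughly how often; it says nothing about their order, so it would have to be combined with some ordering step to reconstruct the string.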

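1 answer:

Answer 0: (score: 1)

After failing to come up with a probabilistic solution, I arrived at this brute-force approach, which seems to produce correct results. For some strings the total state space gets very large, but the algorithm eventually converges :)

It is probably not the most efficient solution, but it does seem to do the job. The code below could be optimized quite a bit, but it should be good enough as an example.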

#include <iostream>
#include <set>
#include <string>
#include <functional>

#include <boost/random/mersenne_twister.hpp>
#include <boost/random/uniform_int_distribution.hpp>

// the input string for the random generator
static const std::string input("hello_world");
// an "empty" character (something not present in the input string)
static const char empty = '?';
// a constant with the input length (the only known fact about the input in the algorithm)
static const unsigned length = input.length();

// randomization algorithm returning triplets
std::string randomTriplet() {
    static boost::random::mt19937 gen;
    static boost::random::uniform_int_distribution<> dist(0, input.length()-1);

    std::set<unsigned> indices;
    while(indices.size() < 3)
        indices.insert(dist(gen));

    std::string result;
    for(auto i : indices)
        result.push_back(input[i]);
    return result;
}

// runs a functor for all possible combinations of input triplet acc to the rules
void allCombinations(const std::string& triplet, const std::function<void(const std::string&)>& functor) {
    for(unsigned a=0;a<length-2;++a)
        for(unsigned b=a+1;b<length-1;++b)
            for(unsigned c=b+1;c<length;++c) {
                std::string tmp(length, empty);
                tmp[a] = triplet[0];
                tmp[b] = triplet[1];
                tmp[c] = triplet[2];

                functor(tmp);
            }
}

// tries to merge two strings, and returns an empty string if it is not possible
std::string putTogether(const std::string& first, const std::string& second) {
    std::string result(length, empty);

    for(unsigned a=0;a<length;++a)
        if((first[a] == empty) || (first[a] == second[a]))
            result[a] = second[a];
        else if(second[a] == empty)
            result[a] = first[a];
        else if(first[a] != second[a])
            return std::string();

    return result;
}

// run a single iteration on a set of input states and a triplet
std::set<std::string> iterate(const std::set<std::string>& states, const std::string& triplet) {
    std::set<std::string> result;

    // try all combinations on all states
    for(auto& s : states) {
        allCombinations(triplet, [&](const std::string& val) {
            // and if merge is possible, insert it into the result
            const std::string together = putTogether(s, val);
            if(!together.empty())
                result.insert(together);
        });
    }

    return result;
}

int main(int argc, char*argv[]) {
    // the current state space (all possible strings given the observations so far)
    std::set<std::string> states;

    // initialisation - take the first triplet and generate all combinations of this triplet
    allCombinations(randomTriplet(), [&](const std::string& val) { states.insert(val); });

    // iterate - the number of iterations is rather arbitrary. We cannot stop when the solution
    //   count is 1, because some input strings (like "hello world", where the double l and o 
    //   are the problem) don't have a unique solution
    for(unsigned a=0;a<10000;++a) {
        states = iterate(states, randomTriplet());
        std::cout << "\r" << "iteration #" << a << ", combinations count = " << states.size() << "   " << std::flush;

        // however, if we do find a single solution, we don't have to go on
        if(states.size() == 1)
            break;
    }
    std::cout << std::endl;

    // print them all
    for(const auto& i : states)
        std::cout << i << std::endl;

    return 0;
}

Interestingly, for the "hello_world" input, the output is:

iteration #9999, combinations count = 6     
hello_world
helol_world
hlelo_world
hleol_world
lhelo_world
lheol_world

The triple 'l' and double 'o' characters produce ambiguities that cannot be resolved using only the three-character observations.
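One possible reading of why the search above cannot separate these candidates: it only ever uses triplets that were observed, never the absence of a triplet, so any candidate that contains every triplet of the true string as a subsequence survives. The sketch below checks that property for two strings; `isSubsequence` and `everyTripletFits` are hypothetical helper names introduced for illustration and are not part of the code above.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// True if `pattern` can be read left to right inside `text`,
// possibly skipping letters of `text`.
bool isSubsequence(const std::string& pattern, const std::string& text) {
    std::size_t p = 0;
    for (char c : text)
        if (p < pattern.size() && c == pattern[p])
            ++p;
    return p == pattern.size();
}

// True if every ordered triplet of letters taken from `source`
// can also be found, in the same order, in `candidate`.
bool everyTripletFits(const std::string& source, const std::string& candidate) {
    assert(source.size() >= 3);
    for (std::size_t a = 0; a + 2 < source.size(); ++a)
        for (std::size_t b = a + 1; b + 1 < source.size(); ++b)
            for (std::size_t c = b + 1; c < source.size(); ++c) {
                const std::string triplet{source[a], source[b], source[c]};
                if (!isSubsequence(triplet, candidate))
                    return false;
            }
    return true;
}
```

For example, `everyTripletFits("hello_world", "helol_world")` is true, so no observed triplet can rule out "helol_world". The reverse call is false, because "helol_world" can yield "olo" while "hello_world" cannot; a method that also reasoned about triplets that never appear might therefore narrow the list further, at the cost of becoming probabilistic.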