Question

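I'm trying to solve a task and I'm not sure whether I'm using a suitable data structure. The task is to determine whether a sentence consists of unique characters and to return a boolean accordingly.

Here is my function: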

bool use_map(string sentence) {
    map<int, string> my_map;

    for (string::size_type i = 0; i < sentence.length(); i++) {
        unsigned int index = (int)sentence[i];    
        if (my_map.find(index) != my_map.end())
            return false;       
        my_map[index] = sentence[i];
    }

    return true;    
}

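The only structure I found that fits is a map. Maybe I'm missing something?

Maybe it would be better to use something like the dynamic arrays in PHP?

I'm trying to use a hash table solution.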

Answer 1

A very simple (but memory-expensive) way is:

bool use_map(const std::string& sentence)
{
    std::set<char> chars(sentence.begin(), sentence.end());
    return chars.size() == sentence.size();
}

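If there are no duplicate characters, the string and the set will have the same size.

@Jonathan Leffler raised a good point in the comments: sentences usually contain several spaces, so this will return false. You will want to filter out spaces. Still, std::set should be your container of choice.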
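
One way to filter the spaces, as a minimal sketch: drop them into a copy before building the set. The name has_unique_chars_ignoring_spaces is made up for this illustration, and skipping only ' ' (not tabs or punctuation) is an assumption.

```cpp
#include <algorithm>
#include <iterator>
#include <set>
#include <string>

// Hypothetical helper: true if no character other than ' ' occurs twice.
bool has_unique_chars_ignoring_spaces(const std::string& sentence)
{
    // Copy the sentence without spaces (assumption: only ' ' is skipped).
    std::string filtered;
    std::remove_copy(sentence.begin(), sentence.end(),
                     std::back_inserter(filtered), ' ');

    // Same size comparison as before, applied to the filtered copy.
    std::set<char> chars(filtered.begin(), filtered.end());
    return chars.size() == filtered.size();
}
```

This still copies the string once more, so it stays on the memory-expensive side.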

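Edit

Here is an idea for an O(n) solution that needs no extra memory beyond a fixed 256-entry table. Just use a lookup table in which you mark whether a char has been seen before: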

bool no_duplicates(const std::string& sentence)
{
    static bool table[256];
    std::fill(table, table+256, 0);

    for (char c : sentence) {
        // don't test spaces
        if (c == ' ') continue;
        // add more tests if needed

        const unsigned char uc = static_cast<unsigned char>(c);
        if (table[uc]) return false;
        table[uc] = true;
    }

    return true;
}

Answer 2

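Other answers suggest std::set, and that is a solution. However, they copy all the characters into the std::set and then take the set's size. You don't really need that; you can avoid it by using the return value of std::set::insert. Something like: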

std::set< char > my_set;
for (std::string::size_type ii = 0; ii < sentence.size(); ++ii) 
{
    if( ! my_set.insert( sentence[ ii ] ).second )
    {
        return false;
    }
}

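This way you will:

- stop at the first duplicate character, so you won't copy the whole string unnecessarily
- avoid the cast to int in your code
- save memory, since you don't actually need std::map< int, std::string >::second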

Also, make sure whether you need to "count" every char or want to skip some of them (such as spaces, commas, question marks, etc.).
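
Putting the two points together, here is a rough sketch of a complete function that uses the return value of std::set::insert and skips characters that should not be counted. The name unique_ignoring_punctuation and the choice to skip whitespace and punctuation are assumptions for illustration; adjust the filter to whatever your task requires.

```cpp
#include <cctype>
#include <set>
#include <string>

// Hypothetical helper: stops at the first repeated character,
// ignoring whitespace and punctuation (an assumption of this sketch).
bool unique_ignoring_punctuation(const std::string& sentence)
{
    std::set<char> seen;
    for (char c : sentence)
    {
        // <cctype> functions expect a value representable as unsigned char.
        const unsigned char uc = static_cast<unsigned char>(c);
        if (std::isspace(uc) || std::ispunct(uc))
        {
            continue;  // not "counted"
        }
        // insert().second is false when the character was already in the set.
        if (!seen.insert(c).second)
        {
            return false;
        }
    }
    return true;
}
```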

Answer 3

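I guess a simple way is to store all characters in an associative container that does not allow duplicates, such as std::set, and check whether its size matches the size of the string: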

#include <set>
#include <string>

bool has_unique_character(std::string const& str)
{
    std::set<char> s(begin(str), end(str));
    return (s.size() == str.size());
}

Answer 4

How about this? Of course there is the issue of letter case (only lowercase a to z is checked)...

bool use_map(const std::string& sentence)
{
    std::vector<bool> chars(26, false);
    for(std::string::const_iterator i = sentence.begin(); i != sentence.end(); ++i) {
        if(*i == ' ' || *i - 'a' > 25 || *i - 'a' < 0) {
            continue;
        } else if(chars[*i - 'a']) {
            return false;
        } else {
            chars[*i - 'a'] = true;
        }
    }

    return true;
}
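
A possible way around the case issue, as a sketch only: fold each character to lowercase before indexing the table. The name unique_letters_ignore_case is hypothetical, and the code assumes an ASCII-compatible encoding in which 'a' to 'z' are contiguous and the default "C" locale.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Hypothetical variant: treats 'A' and 'a' as the same letter
// and ignores everything that is not a letter a-z.
bool unique_letters_ignore_case(const std::string& sentence)
{
    std::vector<bool> seen(26, false);
    for (char c : sentence)
    {
        const int lower = std::tolower(static_cast<unsigned char>(c));
        if (lower < 'a' || lower > 'z')
        {
            continue;  // not a letter: skip it
        }
        if (seen[lower - 'a'])
        {
            return false;  // this letter was already seen
        }
        seen[lower - 'a'] = true;
    }
    return true;
}
```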

Answer 5

Sort the characters, then look for an adjacent pair of equal characters that are alphabetic. Like this:

#include <algorithm>
#include <cctype>
#include <iostream>
#include <string>

std::string my_sentence = /* whatever */;
std::sort(my_sentence.begin(), my_sentence.end());
// Find the first pair of equal neighbours.
std::string::iterator it =
    std::adjacent_find(my_sentence.begin(), my_sentence.end());
// Skip pairs that are not letters (spaces, punctuation, ...).
while (it != my_sentence.end() && !isalpha((unsigned char)*it))
    it = std::adjacent_find(++it, my_sentence.end());
if (it == my_sentence.end())
    std::cout << "No duplicates.\n";
else
    std::cout << "Duplicated '" << *it << "'.\n";

Answer 6

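If you are allowed to use extra memory, use a hash table:
Iterate through the array and check whether the current element has already been hashed. If it has, you have found a duplicate. If not, add it to the hash. This runs in linear time but requires extra memory.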
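
As a rough sketch of that idea, std::unordered_set (C++11) can play the role of the hash table. The function name has_duplicate_hashed is made up for this example.

```cpp
#include <string>
#include <unordered_set>

// Hypothetical helper: true if some character occurs more than once.
bool has_duplicate_hashed(const std::string& s)
{
    std::unordered_set<char> seen;
    for (char c : s)
    {
        // insert() returns {iterator, bool}; the bool is false for an existing key.
        if (!seen.insert(c).second)
        {
            return true;
        }
    }
    return false;
}
```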

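If the range of the original sequence's elements is quite small, then instead of hashing you can simply use an array as large as that range and proceed as in bucket sort. For example: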

bool hasDuplicate( string s )
{
   int n = s.size();
   vector<char> v( 256, 0 );
   for( int i = 0; i < n; ++i )
   {
      // go through unsigned char so a negative char cannot index out of range
      unsigned char c = s[ i ];
      if( v[ c ] ) // v[ hash( s[i] ) ] here in case of hash usage
         return true;
      else
         v[ c ] = 1; // and here too
   }
   return false;
}

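Finally, if you are not allowed to use extra memory, you can sort the sequence and check in a single pass whether any two adjacent elements are equal. This takes O(n log n) time. No sets or maps needed :)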
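
A minimal sketch of the sorting approach, assuming it is acceptable to reorder the caller's string in place (that is what avoids the extra memory). The name has_duplicate_sorted is hypothetical.

```cpp
#include <algorithm>
#include <string>

// Hypothetical helper: sorts s in place, then looks for equal neighbours.
// Note that the caller's string is left sorted afterwards.
bool has_duplicate_sorted(std::string& s)
{
    std::sort(s.begin(), s.end());  // O(n log n)
    // adjacent_find returns end() when no two neighbours are equal.
    return std::adjacent_find(s.begin(), s.end()) != s.end();
}
```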

Answer 7

This is the fastest solution:

bool charUsed[256];
bool isUnique(string sentence) {
    int i;
    for(i = 0; i < 256; ++i) {
        charUsed[i] = false;
    }

    int n = sentence.size();
    for(i = 0; i < n; ++i) {
        if (charUsed[(unsigned char)sentence[i]]) {
            return false;
        }
        charUsed[(unsigned char)sentence[i]] = true;
    }
    return true;
}

Which data structure is better for finding out whether a sentence consists of unique characters?
