Question

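This program takes the words from a text and puts them in a vector; after that, it compares each element with the next one.

So I am trying to compare the vector elements like this: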

sort(words.begin(), words.end());
int cc = 1;
int compte = 1;
int i;
//browse the vector
for (i = 0; i <= words.size(); i++) {     // comparison
    if (words[i] == words[cc]) { 
        compte = compte + 1; 
    }

    else {     // displaying the word with comparison
        cout << words[i] << " Repeated :  " << compte; printf("\n");
        compte = 1; cc = i;
    }
}

My problem is with the bounds: i+1 may go past the end of the vector. How do I handle this case?

Answer 1

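When you iterate and compare at the same time, you need to pay closer attention to the initial conditions and the bounds. Tracing the code by hand with pen and paper first is usually a good idea.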

sort(words.begin(), words.end()); // make sure !words.empty()
int cc = 0; // index of the word we need to compare.
int compte = 1; // counting of the number of occurrence.
for( size_t i = 1; i < words.size(); ++i ){
    // since you already count the first word, now we are at i=1
    if( words[i] == words[cc] ){
        compte += 1;
    }else{
        // words[i] is going to be different from words[cc].
        cout << words[cc] << " Repeated :  " << compte << '\n';
        compte = 1;
        cc = i;
    }
}
 // to output the last word with its repeat 
cout << words[cc] << " Repeated :  " << compte << '\n';
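
The loop above assumes that words is not empty, as its first comment points out. As a rough sketch, the same logic could be wrapped in a function with an explicit empty check. The name count_runs and the choice to return (word, count) pairs instead of printing are assumptions made for illustration, not part of the original answer.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: returns each distinct word with its number of occurrences,
// in sorted order. The vector is taken by value so the caller's copy is not reordered.
std::vector<std::pair<std::string, int>> count_runs(std::vector<std::string> words) {
    std::vector<std::pair<std::string, int>> result;
    if (words.empty()) {
        return result;  // nothing to compare, and words[0] would be out of range
    }
    std::sort(words.begin(), words.end());
    std::size_t cc = 0;  // index of the first word of the current run
    int compte = 1;      // occurrences counted so far in the current run
    for (std::size_t i = 1; i < words.size(); ++i) {
        if (words[i] == words[cc]) {
            ++compte;
        } else {
            result.emplace_back(words[cc], compte);  // the run ended at i - 1
            compte = 1;
            cc = i;
        }
    }
    result.emplace_back(words[cc], compte);  // the last run is not flushed by the loop
    return result;
}
```

Returning the pairs keeps the counting separate from the output, so the caller decides how to print them.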

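Just some additional information: there are better ways to count the number of occurrences of each word. For example, you can use unordered_map<string,int>.

Hope this helps.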
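
The unordered_map suggestion could look roughly like the following. This is only a minimal sketch; the function name count_words is a hypothetical choice, not something from the answer.

```cpp
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical helper: maps each distinct word to the number of times it occurs.
// No sorting and no index arithmetic are needed, so there is no bound to get wrong.
std::unordered_map<std::string, int> count_words(const std::vector<std::string>& words) {
    std::unordered_map<std::string, int> counts;
    for (const auto& w : words) {
        ++counts[w];  // operator[] inserts a missing key with value 0 before the increment
    }
    return counts;
}
```

Note that an unordered_map iterates in no particular order; if the words must be listed alphabetically, std::map is the ordered alternative.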

Answer 2

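My problem is with the bounds: i + 1 may go past the end of the vector. How do I handle this case?

In modern C++ coding, the problem of indexing past the bounds of a vector can be avoided. Use the STL containers and avoid using indexes. With just a little effort spent learning how to use containers this way, you should never see this kind of "off by one" error again! As a bonus, the code becomes easier to understand and maintain.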

#include <iostream>
#include <string>
#include <vector>
#include <map>
using namespace std;

int main() {

    // a test vector of words
    vector< string > words { "alpha", "gamma", "beta", "gamma" };

    // map unique words to their appearance count
    map< string, int > mapwordcount;

    // loop over words
    for( auto& w : words )
    {
        // insert word into map
        auto ret = mapwordcount.insert( pair<string,int>( w, 1 ) );
        if( ! ret.second )
        {
            // word already present
            // so increment count
            ret.first->second++;
        }
    }

    // loop over map
    for( auto& m : mapwordcount )
    {
        cout << "word '" << m.first << "' appears " << m.second << " times\n";
    }
    return 0;
}

Produces

word 'alpha' appears 1 times
word 'beta' appears 1 times
word 'gamma' appears 2 times

https://ideone.com/L9VZt6

If some book or person is teaching you to write code full of

for (i = 0; i < ...

then you should run away quickly and learn modern coding elsewhere.

Answer 3

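C++ uses zero-based indexing; for example, an array of length 5 has the indices {0, 1, 2, 3, 4}. This means that index 5 is out of range.

Similarly, given an array of characters arr: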

 char arr[] = {'a', 'b', 'c', 'd', 'e'};

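The loop for (int i = 0; i <= std::size(arr); ++i) { arr[i]; } reads outside the range when i is equal to the length of arr, which causes undefined behavior. To avoid this, the loop must stop before i becomes equal to the length of the array.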

for (std::size_t i = 0; i < std::size(arr); ++i) { arr[i]; }

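Also note the use of std::size_t as the type of the index counter. This is common practice in C++.

Now, let's look at an example of how this can be done much more easily with the standard library.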
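
The same reasoning about bounds applies when the loop body reads element i + 1, as in the question: the condition then has to guarantee that i + 1 is a valid index. Below is a minimal sketch under that assumption; the function name count_adjacent_equal is hypothetical and only serves to illustrate the loop condition.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: counts the positions i where words[i] == words[i + 1].
std::size_t count_adjacent_equal(const std::vector<std::string>& words) {
    std::size_t equal_pairs = 0;
    // "i + 1 < size()" keeps words[i + 1] in range. For an empty vector the
    // condition is 1 < 0, which is false, so the body never runs.
    for (std::size_t i = 0; i + 1 < words.size(); ++i) {
        if (words[i] == words[i + 1]) {
            ++equal_pairs;
        }
    }
    return equal_pairs;
}
```

Writing the condition as i < words.size() - 1 looks equivalent, but because size() is unsigned, the subtraction wraps around to a huge value when the vector is empty, so the form above is the safer one.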

std::sort(std::begin(words), std::end(words));
std::map<std::string, std::size_t> counts;
std::for_each(std::begin(words), std::end(words), [&] (const auto& w) { ++counts[w]; });

Print the result with:

for (auto&& [word, count] : counts) {
    std::cout << word << ": " << count << std::endl;
}

Answer 4

Count the repeated words using some C++ STL goodies, namely multiset and upper_bound:

#include <iostream>
#include <vector>
#include <string>
#include <set>

int main()
{
    std::vector<std::string> words{ "one", "two", "three", "two", "one" };
    std::multiset<std::string> ms(words.begin(), words.end());
    for (auto it = ms.begin(), end = ms.end(); it != end; it = ms.upper_bound(*it))
        std::cout << *it << " is repeated: " << ms.count(*it) << " times" << std::endl;

    return 0;
}

https://ideone.com/tPYw4a

Vector element comparison in C++

4 answers: