Question

I wrote this code to count the occurrences of each character in a string, but it doesn't count spaces or any extended ASCII characters... any ideas?

#include <iostream>
#include <map>
#include <string>

int main()
{
std::string input = "slowly";

std::map<char, int> occurrences;

for (std::string::iterator character = input.begin(); character != input.end(); character++)
{
    occurrences[*character] += 1;
}

for (std::map<char, int>::iterator entry = occurrences.begin(); entry != occurrences.end(); entry++)
{
    std::cout << entry->first << '=' << entry->second << std::endl;
}
}

Also, if there is a faster algorithm that gets the same results for a large number of characters, I would be grateful.

Answer 1

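For really long strings (more than a few million elements), you could split the string into smaller parts and hand each part to a thread that processes its own part and adds to its own small map; once all the threads have finished, merge the maps. For strings of up to a few thousand or even tens of thousands of characters it probably won't make much difference, and for strings shorter than a few thousand characters, setting up the threads and merging the maps may well take longer than the current linear approach.

Unless you need the results sorted, use std::unordered_map instead.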
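The split-and-merge idea could be sketched roughly as follows. This is only a minimal illustration, assuming a C++11 compiler; the names Counts, count_range and count_parallel are made up for this example and are not from the question.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <thread>
#include <unordered_map>
#include <vector>

// Hypothetical alias: one unsorted count table per thread.
typedef std::unordered_map<char, std::size_t> Counts;

// Hypothetical helper: counts the characters in input[begin, end).
void count_range(const std::string& input, std::size_t begin, std::size_t end, Counts& out)
{
    for (std::size_t i = begin; i < end; ++i)
        ++out[input[i]];
}

// Hypothetical helper: splits the string into `parts` chunks, counts each
// chunk on its own thread, then merges the per-thread maps.
Counts count_parallel(const std::string& input, std::size_t parts)
{
    assert(parts > 0);
    std::vector<Counts> partial(parts);   // one map per thread, no sharing
    std::vector<std::thread> workers;
    const std::size_t chunk = (input.size() + parts - 1) / parts;  // rounded up

    for (std::size_t p = 0; p < parts; ++p) {
        const std::size_t begin = std::min(p * chunk, input.size());
        const std::size_t end = std::min(begin + chunk, input.size());
        workers.emplace_back([&input, &partial, begin, end, p] {
            count_range(input, begin, end, partial[p]);
        });
    }
    for (std::size_t p = 0; p < workers.size(); ++p)
        workers[p].join();

    // Merge step: add every partial count into the final map.
    Counts total;
    for (std::size_t p = 0; p < partial.size(); ++p)
        for (Counts::const_iterator it = partial[p].begin(); it != partial[p].end(); ++it)
            total[it->first] += it->second;
    return total;
}
```

Each thread writes only to its own map, so no locking is needed until the merge. Whether this beats the plain loop depends on the input size, so it is worth measuring before adopting it.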

Answer 2

I think your code is fine, because it works for me. I ran it with Visual Studio C++ 2010 on a 64-bit Windows 7 machine.

#include "stdafx.h"
#include <iostream>
#include <map>
#include <string>

int main()
{
std::string input = "And if there's any faster algorithm to deal with a large amount of characters to get the same results i would be thankful ?? How to count the occurrence of the spaces and any ascii characters in a string?";

std::map<char, int> occurrences;

for (std::string::iterator character = input.begin(); character != input.end(); character++)
{
    occurrences[*character] += 1;
}

for (std::map<char, int>::iterator entry = occurrences.begin(); entry != occurrences.end(); entry++)
{
    std::cout << entry->first << '=' << entry->second << std::endl;
}
}

Here is the result of running the code above:

Update

Here is a new version in which I use a file containing the same text as the input variable above:

#include "stdafx.h"
#include <iostream>
#include <map>
#include <string>
#include <fstream>
#include <streambuf>

std::ifstream inputFile("text.txt");
std::string input((std::istreambuf_iterator<char>(inputFile)),
                 std::istreambuf_iterator<char>());


int main()
{

std::map<char, int> occurrences;

for (std::string::iterator character = input.begin(); character != input.end(); character++)
{
    occurrences[*character] += 1;
}

for (std::map<char, int>::iterator entry = occurrences.begin(); entry != occurrences.end(); entry++)
{
    std::cout << entry->first << '=' << entry->second << std::endl;
}
}

We get the same result:

Answer 3

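Well, this code does count every character in the string. With your example, if you had written std::string input = "slowly \tmedium \rfast \n";, it would count 3 spaces (32), one tab (9), one CR (13) and one newline (10).

Of course, if you read the file with something like: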

std::string input;
...
in >> input;

you are explicitly asking for whitespace-delimited words, so you will never see a space or any other whitespace character (\t, \r, \n).

If you want to count all the characters in a file, you must use a binary read (in.read(char *buf, streamsize size)) on a file opened in binary mode.
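A binary read along those lines might look like the sketch below. The function name count_file_bytes and the 4096-byte buffer size are arbitrary choices for this example, not anything prescribed by the question.

```cpp
#include <cstddef>
#include <fstream>
#include <ios>
#include <map>
#include <string>

// Hypothetical helper: counts every byte of the file at `path`.
// Returns an empty map if the file cannot be opened.
std::map<unsigned char, std::size_t> count_file_bytes(const std::string& path)
{
    std::map<unsigned char, std::size_t> occurrences;
    std::ifstream in(path.c_str(), std::ios::in | std::ios::binary);
    char buffer[4096];
    while (in) {
        in.read(buffer, sizeof buffer);
        // gcount() reports how many bytes the last read actually delivered,
        // which is less than the buffer size on the final chunk.
        const std::streamsize got = in.gcount();
        for (std::streamsize i = 0; i < got; ++i)
            ++occurrences[static_cast<unsigned char>(buffer[i])];
    }
    return occurrences;
}
```

Opening in binary mode matters mostly on Windows, where text mode would translate each \r\n pair into a single \n before the program sees it.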

One last caveat: if your file uses a multi-byte encoding such as UTF-8, you must be prepared to see a single é character as the 2 bytes 0xc3 0xa9...
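A quick way to see this for yourself is to dump the bytes of a string in hex. The helper below is only an illustration (the name to_hex_bytes is made up); it assumes the text is stored as UTF-8 in a std::string.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: renders each byte of `text` as two lowercase hex
// digits, separated by spaces.
std::string to_hex_bytes(const std::string& text)
{
    static const char digits[] = "0123456789abcdef";
    std::string out;
    for (std::size_t i = 0; i < text.size(); ++i) {
        // char may be signed, so go through unsigned char before shifting.
        const unsigned char byte = static_cast<unsigned char>(text[i]);
        if (i != 0)
            out += ' ';
        out += digits[byte >> 4];
        out += digits[byte & 0x0f];
    }
    return out;
}
```

For the UTF-8 encoding of é, written as the literal "\xc3\xa9", the string has size 2 and to_hex_bytes returns "c3 a9", so a per-char counter will report two separate entries rather than one é.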

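As for speed, you could start by using an array instead of a map to count the characters, and optionally make the program multithreaded for really large files, as Joachim explained.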
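The array approach could be sketched like this, assuming 8-bit chars and a C++11 compiler. The function name count_with_array is invented for this example.

```cpp
#include <array>
#include <cstddef>
#include <string>

// Hypothetical helper: counts occurrences using a fixed table with one slot
// per possible byte value instead of a map.
std::array<std::size_t, 256> count_with_array(const std::string& input)
{
    std::array<std::size_t, 256> counts = {};  // value-initialised: all zeros
    for (std::size_t i = 0; i < input.size(); ++i) {
        // char may be signed, so characters above 127 would give a negative
        // index without this cast.
        ++counts[static_cast<unsigned char>(input[i])];
    }
    return counts;
}
```

Each update is a direct index rather than a tree lookup. When printing, you would loop over all 256 slots and skip the ones that are still zero.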
