Question

I am trying to read just the integers from a text file structured like this....

ALS 46000
BZK 39850
CAR 38000
//....

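using ifstream.

I've considered two options.

1) Regex using Boost

2) Creating a throwaway string (i.e. I read in a word, do nothing with it, then read in the score). However, this is a last resort.

Is there any way to express in C++ that I want the ifstream to read only the text that is an integer? I'm reluctant to use regular expressions if it turns out there is a much simpler way to accomplish this.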

Answer 1

Why make simple things complicated?

What's wrong with this:

ifstream ss("C:\\test.txt");

int score;
string name;
while( ss >> name >> score )
{
    // do something with score
}

Answer 2

Edit: it is in fact possible to work on streams directly with Spirit, rather than line by line as I suggested previously, using the parser:

+(omit[+(alpha|blank)] >> int_)

and one line of code (not counting the variable definitions):

void extract_file()
{
    std::ifstream f("E:/dd/dd.trunk/sandbox/text.txt");
    boost::spirit::istream_iterator it_begin(f), it_end;

    // extract all numbers into a vector
    std::vector<int> vi;
    parse(it_begin, it_end, +(omit[+(alpha|blank)] >> int_), vi);

    // print them to verify
    std::copy(vi.begin(), vi.end(),
        std::ostream_iterator<int>(std::cout, ", " ));
}

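You get all the numbers into a vector at once with one line; it couldn't be simpler.

If you don't mind using Boost.Spirit2, the parser that gets only the number from one line is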

omit[+(alpha|blank)] >> int_

and the parser that extracts everything is

+(alpha|blank) >> int_

See the whole program below (tested with VC10 Beta 2):

#include <boost/spirit/include/qi.hpp>
#include <iostream>
#include <string>
#include <cstring>
#include <vector>
#include <fstream>
#include <algorithm>
#include <iterator>

using std::cout;

using namespace boost::spirit;
using namespace boost::spirit::qi;

void extract_everything(std::string& line)
{
    std::string::iterator it_begin = line.begin();
    std::string::iterator it_end = line.end();

    std::string s;
    int i;

    parse(it_begin, it_end, +(alpha|blank)>>int_, s, i);

    cout << "string " << s
         << "followed by nubmer " << i
         << std::endl;
}

void extract_number(std::string& line)
{
    std::string::iterator it_begin = line.begin();
    std::string::iterator it_end = line.end();

    int i;

    parse(it_begin, it_end, omit[+(alpha|blank)] >> int_, i);

    cout << "number only: " << i << std::endl;
}

void extract_line()
{
    std::ifstream f("E:/dd/dd.trunk/sandbox/text.txt");
    std::string s;
    int i;

    // iterated file line by line
    while(getline(f, s))
    {
        cout << "parsing " << s << " yields:\n";
        extract_number(s);  //
        extract_everything(s);
    }
}

void extract_file()
{
    std::ifstream f("E:/dd/dd.trunk/sandbox/text.txt");
    boost::spirit::istream_iterator it_begin(f), it_end;

    // extract all numbers into a vector
    std::vector<int> vi;
    parse(it_begin, it_end, +(omit[+(alpha|blank)] >> int_), vi);

    // print them to verify
    std::copy(vi.begin(), vi.end(),
        std::ostream_iterator<int>(std::cout, ", " ));
}

int main(int argc, char * argv[])
{
    extract_line();
    extract_file();

    return 0;
}

Output:

parsing ALS 46000 yields:
number only: 46000
string ALS followed by nubmer 46000
parsing BZK 39850 yields:
number only: 39850
string BZK followed by nubmer 39850
parsing CAR 38000 yields:
number only: 38000
string CAR followed by nubmer 38000
46000, 39850, 38000,

Answer 3

You can call ignore to have the stream skip over a specified number of characters.

istr.ignore(4);

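You can also tell it to stop at a delimiter. You still need to know the maximum number of characters the leading string could contain, but this also works for shorter leading strings: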

istr.ignore(10, ' ');

You could also write a loop that just reads characters until it sees the first digit character:

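char c;
// istream has no getchar() member; get(c) reads a single character.
// The cast keeps isdigit() well defined for characters with negative values.
while (istr.get(c) && !isdigit(static_cast<unsigned char>(c)))
{
    // do nothing
}
// put the digit back so the following extraction sees the whole number
if (istr && isdigit(static_cast<unsigned char>(c)))
    istr.putback(c);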

Answer 4

Here goes :P

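import java.io.File;
import java.io.FileNotFoundException;
import java.util.HashMap;
import java.util.Scanner;

// The class declaration was missing from the post; the name ReadFile is assumed.
public class ReadFile {

    private static void readFile(String fileName) {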

        try {
            HashMap<String, Integer> map = new HashMap<String, Integer>();
            File file = new File(fileName);

            Scanner scanner = new Scanner(file).useDelimiter(";");
            while (scanner.hasNext()) {
                String token = scanner.next();
                String[] split = token.split(":");
                if (split.length == 2) {
                    Integer count = map.get(split[0]);
                    map.put(split[0], count == null ? 1 : count + 1);
                    System.out.println(split[0] + ":" + split[1]);
                } else {
                    split = token.split("=");
                    if (split.length == 2) {
                        Integer count = map.get(split[0]);
                        map.put(split[0], count == null ? 1 : count + 1);
                        System.out.println(split[0] + ":" + split[1]);
                    }
                }
            }
            scanner.close();
            System.out.println("Counts:" + map);
        } catch (FileNotFoundException e) {
            e.printStackTrace();
        }
    }

    public static void main(String[] args) {
        readFile("test.txt");
    }
}

Answer 5

fscanf(file, "%*s %d", &num);

Or %05d, if you have leading zeros and a fixed width of 5....

Sometimes the fastest way to do things in C++ is to use C. :)
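
To show how the return value can be used to detect a bad line, here is a small sketch that applies the same format string with sscanf to a single line already held in memory (sscanf is swapped in for fscanf only so the example needs no file). The function name parse_score is made up for illustration.

```cpp
#include <cstdio>

// Parses one "WORD NUMBER" line.
// Returns true and stores the number in num when the line matches.
bool parse_score(const char* line, int& num)
{
    // %*s reads a whitespace-delimited word and discards it;
    // %d then stores the integer. The scanf family returns the number
    // of assigned fields, so 1 means the integer was read.
    return std::sscanf(line, "%*s %d", &num) == 1;
}
```

The same check works with fscanf in a loop: keep reading while the call returns 1.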

Answer 6

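You can create a ctype facet that classifies letters as whitespace. Create a locale that uses this facet, then imbue the stream with that locale. With that in place, you can extract numbers from the stream, but all letters will be treated as whitespace (i.e. when you extract numbers, the letters will be ignored just like a space or a tab would be).

Such a locale can look like this: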

#include <iostream>
#include <locale>
#include <vector>
#include <algorithm>
#include <iterator>

struct digits_only: std::ctype<char> 
{
    digits_only(): std::ctype<char>(get_table()) {}

    static std::ctype_base::mask const* get_table()
    {
        static std::vector<std::ctype_base::mask> 
            rc(std::ctype<char>::table_size,std::ctype_base::space);

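        // Clear the space classification for all ten digits, '0' through '9'.
        // (A count of 9 would leave '9' classified as whitespace.)
        if (rc['0'] == std::ctype_base::space)
            std::fill_n(&rc['0'], 10, std::ctype_base::mask());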
        return &rc[0];
    }
};

Sample code to use it could look like this:

int main() {
    std::cin.imbue(std::locale(std::locale(), new digits_only()));

    std::copy(std::istream_iterator<int>(std::cin), 
        std::istream_iterator<int>(),
        std::ostream_iterator<int>(std::cout, "\n"));
}

Using your sample data, the output I get from this looks like this:

46000
39850
38000

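Note that as it stands, I've written this to accept only digits. If (for example) you were reading floating point numbers, you'd also want to retain '.' (or the locale-specific equivalent) as the decimal point. One way to handle things is to start with a copy of the normal ctype table, and then just set the things you want to ignore as space.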

Reading integers from a text file with words
