Question

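I have a text file in which each line needs to be converted to an integer.

A line can start with '#' to mark it as a comment. The data can also be followed by an inline comment, again introduced by '#'.

So I have the following examples: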

QString time = "5000 #this is 5 seconds";  // OK
QString time = "  5000 # this is 5 seconds"; // OK: leading whitespace is allowed
QString time = "5000.00 #this is 5 seconds"; // invalid: no decimals allowed
QString time = "s5000 # this is 5 seconds"; // invalid: does not start with a numeric character

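How do I handle these cases? In all four examples above, except the last two, I need to extract "5000". How do I find out that the last one is invalid?

In other words, what is the most fail-safe code for this task?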

Answer 1

You can use this regex to validate the line and capture the number in the first group:

^\s*(\d+)\b(?!\.)

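Explanation:

^ - start of the string
\s* - optional whitespace before the number
(\d+) - captures the digits in the first group
\b - word boundary after the digits; it stops the engine from backtracking to a shorter run of digits (for example "500" in "5000.00") just to get past the negative lookahead that follows
(?!\.) - rejects the match if the number is followed by a decimal point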
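
To see the pattern in action without Qt, here is a minimal sketch using std::regex, whose default ECMAScript grammar supports both \b and negative lookahead. The helper name leading_integer is made up for illustration and is not part of the answer; std::regex_search is used because the pattern is anchored only at the start of the line.

```cpp
#include <cassert>
#include <optional>
#include <regex>
#include <string>

// Hypothetical helper (not from the answer): returns the captured digits,
// or std::nullopt when the line does not start with a plain integer.
std::optional<std::string> leading_integer(const std::string& line)
{
    // Same pattern as in the answer. It has no end anchor, so anything
    // after the number (such as a trailing '#' comment) is ignored.
    static const std::regex re(R"(^\s*(\d+)\b(?!\.))");

    std::smatch match;
    if (std::regex_search(line, match, re))
    {
        return match[1].str();
    }
    return std::nullopt;
}
```

With the four lines from the question, this should yield "5000" for the first two and an empty optional for "5000.00 ..." and "s5000 ...".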

Demo1

If only the last one is invalid, you can use this regex to capture the number from the first three entries:

^\s*(\d+)

Demo2

Answer 2

Another example, this time using std::regex. Converting the QString to a string_view is left as an exercise for the reader.

#include <regex>
#include <string_view>
#include <iostream>
#include <string>
#include <optional>

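// Returns the leading integer as text, or an empty optional if the line is invalid.
std::optional<std::string> extract_number(std::string_view input)
{
    // Optional leading whitespace, digits, optional whitespace, optional '#' comment.
    static constexpr char expression[] = R"xx(^\s*(\d+)\s*(#.*)?$)xx";
    static const auto re = std::regex(expression);

    auto result = std::optional<std::string>();
    auto match = std::cmatch();
    // Pass raw pointers: std::cmatch works on const char*, and string_view
    // iterators are not guaranteed to be pointers on every implementation.
    const auto matched = std::regex_match(input.data(), input.data() + input.size(), match, re);
    if (matched)
    {
        result.emplace(match[1].first, match[1].second);
    }

    return result;
}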

void emit(std::string_view candidate, std::optional<std::string> result)
{
    std::cout << "offered: " << candidate << " - result : " << result.value_or("no match") << '\n';
}

int main()
{
    const std::string_view candidates[] =
    {
        "5000 #this is 5 seconds",
        "  5000 # this is 5 seconds",
        "5000.00 #this is 5 seconds",
        "s5000 # this is 5 seconds"
    };

    for(auto candidate : candidates)
    {
        emit(candidate, extract_number(candidate));
    }
}

Expected output:

offered: 5000 #this is 5 seconds - result : 5000
offered:   5000 # this is 5 seconds - result : 5000
offered: 5000.00 #this is 5 seconds - result : no match
offered: s5000 # this is 5 seconds - result : no match

https://coliru.stacked-crooked.com/a/2b0e088e6ed0576b
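
Since the question asks for an integer rather than the matched text, the extracted digits still need converting. A minimal sketch of that last step, assuming C++17 and using std::from_chars so that values too large for int are reported instead of silently wrapping; the helper name to_int is hypothetical and not part of the answer:

```cpp
#include <cassert>
#include <charconv>
#include <optional>
#include <string_view>
#include <system_error>

// Hypothetical helper (not from the answer): converts an all-digit string
// to int, returning std::nullopt if it is malformed or out of range.
std::optional<int> to_int(std::string_view digits)
{
    int value = 0;
    const char* first = digits.data();
    const char* last = digits.data() + digits.size();
    const auto [ptr, ec] = std::from_chars(first, last, value);

    // Require success and that every character was consumed.
    if (ec != std::errc() || ptr != last)
    {
        return std::nullopt;
    }
    return value;
}
```

Feeding it the string returned by extract_number gives an int for "5000" and an empty optional for a digit run that does not fit in an int.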

Validating the integer part of a string
