I am trying to find all parameter values in a string that uses the following format:
pN stands for the Nth parameter; its name can be composed of the following chars:
letters, numbers, and any char included in kSuportedNamesCharsRegEx
vNX stands for the Xth component of the value of the Nth parameter
vNX accepts arithmetic expressions, which is why I constructed kSuportedValuesCharsRegEx. Additionally, a value may be a simple or nested list.
Here is an example of a string to parse:
p1 p2 = (v21 + v22) p3=v31-v32 p4 p5=v5
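I should obtain "p1", "p2 = (v21 + v22)", "p3=v31-v32", "p4", "p5=v5".
As you can see, a parameter may or may not have a value. I am using the C++ Boost libraries (so I think I don't have lookbehind available). Until now I only needed to handle parameters that have values, so I have been using the following: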
static const std::string kSpecialCharsRegEx = "\\.\\{\\}\\(\\)\\\\\\*\\-\\+\\?\\|\\^\\$";
static const std::string kSuportedNamesCharsRegEx = "[A-Za-z0-9çÇñÑáÁéÉíÍóÓúÚ@%_:;,<>/"
+ kSpecialCharsRegEx + "]+";
static const std::string kSuportedValuesCharsRegEx = "([\\s\"A-Za-z0-9çÇñÑáÁéÉíÍóÓúÚ@%_:;,<>/"
+ kSpecialCharsRegEx + "]|(==)|(>=)|(<=))+";
static const std::string kSimpleListRegEx = "\\[" + kSuportedValuesCharsRegEx + "\\]";
static const std::string kDeepListRegEx = "\\[(" + kSuportedValuesCharsRegEx + "|(" + kSimpleListRegEx + "))+\\]";
// Main idea
//static const std::string stackRegex = "\\w+\\s*=\\s*[\\w\\s]+(?=\\s+\\w+=)"
// "|\\w+\\s*=\\s*[\\w\\s]+(?!\\w+=)"
// "|\\w+\\s*=\\s*\\[[\\w\\s]+\\]";
// + deep listing support
// Main regex
static const std::string kParameterRegEx =
"\\b" + kSuportedNamesCharsRegEx + "\\b\\s*=\\s*" + kSuportedValuesCharsRegEx + "(?=\\s+\\b" + kSuportedNamesCharsRegEx + "\\b=)"
+ "|"
+ "\\b" + kSuportedNamesCharsRegEx + "\\b\\s*=\\s*" + kSuportedValuesCharsRegEx +"(?!" + kSuportedNamesCharsRegEx + "=)"
+ "|"
+ "\\b" + kSuportedNamesCharsRegEx + "\\b\\s*=\\s*(" + kDeepListRegEx + ")";
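However, I now also need to handle parameters without values, and I am having trouble building the correct regular expression.
Can someone help me with this? Thanks in advance.
Answer 0 (score: 2)
As mkaes suggested, you just need to devise a simple grammar here. This is the Boost Spirit approach: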
op = char_("-+/*");
name = +(graph - '='); // excluding `op` is not even necessary here
simple = +(graph - op);
expression = raw [
'(' >> expression >> ')'
| simple >> *(op >> expression)
];
value = expression;
definition = name >> - ('=' > value);
start = *definition;
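See it Live On Coliru.
The raw[] directive is there so that we can ignore the whole expression structure for the purposes of tokenizing/validating: it exposes the matched source text instead of a parsed structure. For names I simply accepted all non-space characters, except the operator characters.
Use it like: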
int main()
{
using It = std::string::const_iterator;
std::string const input = "p1 p2 = (v21 + v22) p3=v31-v32 p4 p5=v5";
It first(input.begin()), last(input.end());
Definitions defs;
if (qi::phrase_parse(first, last, grammar<It>(), qi::space, defs))
{
std::cout << "Parsed " << defs.size() << " definitions\n";
for (auto const& def : defs)
{
std::cout << def.name;
if (def.value)
std::cout << " with value expression '" << *def.value << "'\n";
else
std::cout << " with no value expression\n";
}
} else
{
std::cout << "Parse failed\n";
}
if (first != last)
std::cout << "Remaining unparsed input: '" << std::string(first,last) << "'\n";
}
Prints:
Parsed 5 definitions
p1 with no value expression
p2 with value expression '(v21 + v22)'
p3 with value expression 'v31-v32'
p4 with no value expression
p5 with value expression 'v5'
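Full code for reference:
#include <boost/fusion/adapted/struct.hpp>
#include <boost/optional.hpp>
#include <boost/spirit/include/qi.hpp>
#include <iostream>
#include <string>
#include <vector>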
namespace qi = boost::spirit::qi;
struct Definition {
std::string name;
boost::optional<std::string> value;
};
BOOST_FUSION_ADAPT_STRUCT(Definition, (std::string, name)(boost::optional<std::string>, value))
using Definitions = std::vector<Definition>;
template <typename Iterator, typename Skipper = qi::space_type>
struct grammar : qi::grammar<Iterator, Definitions(), Skipper>
{
grammar() : grammar::base_type(start) {
using namespace qi;
name = +(graph - '=');
simple = name;
expression = raw [
'(' >> expression >> ')'
| simple >> *(char_("+-/*") >> expression)
];
value = expression;
definition = name >> - ('=' > value);
start = *definition;
}
private:
qi::rule<Iterator> simple;
qi::rule<Iterator, std::string(), Skipper> expression, value;
qi::rule<Iterator, std::string()/*no skipper*/> name;
qi::rule<Iterator, Definition(), Skipper> definition;
qi::rule<Iterator, Definitions(), Skipper> start;
};
int main()
{
using It = std::string::const_iterator;
std::string const input = "p1 p2 = (v21 + v22) p3=v31-v32 p4 p5=v5";
It f(input.begin()), l(input.end());
Definitions defs;
if (qi::phrase_parse(f, l, grammar<It>(), qi::space, defs))
{
std::cout << "Parsed " << defs.size() << " definitions\n";
for (auto const& def : defs)
{
std::cout << def.name;
if (def.value)
std::cout << " with value expression '" << *def.value << "'\n";
else
std::cout << " with no value expression\n";
}
} else
{
std::cout << "Parse failed\n";
}
if (f != l)
std::cout << "Remaining unparsed input: '" << std::string(f,l) << "'\n";
}
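For readers who cannot pull in Boost.Spirit, the same idea (a name, optionally followed by '=' and a value expression kept as raw text) can be sketched as a small hand-written recursive-descent parser. This is only an illustration under stated assumptions, not the code from the answer: the names ParsedDefinition, DefinitionParser and parseDefinitions are hypothetical, and the expression rule is a slightly stricter variant (term := '(' expression ')' | simple, where simple excludes operators, '=' and parentheses).

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical counterpart of the answer's Definition struct, without Boost.
struct ParsedDefinition {
    std::string name;
    bool has_value = false;
    std::string value;  // raw source text of the value expression
};

class DefinitionParser {
public:
    explicit DefinitionParser(std::string text) : text_(std::move(text)) {}

    // Appends every definition to `out`; returns false on a malformed value.
    bool parse(std::vector<ParsedDefinition>& out) {
        for (;;) {
            skipSpace();
            if (atEnd()) return true;
            ParsedDefinition def;
            // name := one or more non-space characters other than '='
            def.name = readWhile([](char c) { return c != '='; });
            if (def.name.empty()) return false;  // stray '=' without a name
            skipSpace();
            if (!atEnd() && text_[pos_] == '=') {
                ++pos_;
                skipSpace();
                const std::size_t begin = pos_;
                if (!expression()) return false;
                def.has_value = true;
                def.value = text_.substr(begin, pos_ - begin);
            }
            out.push_back(def);
        }
    }

private:
    bool atEnd() const { return pos_ >= text_.size(); }
    static bool isSpace(char c) { return std::isspace(static_cast<unsigned char>(c)) != 0; }
    static bool isOp(char c) { return c == '+' || c == '-' || c == '*' || c == '/'; }
    void skipSpace() { while (!atEnd() && isSpace(text_[pos_])) ++pos_; }

    // Consumes non-space characters for as long as `keep` accepts them.
    template <typename Pred>
    std::string readWhile(Pred keep) {
        const std::size_t begin = pos_;
        while (!atEnd() && !isSpace(text_[pos_]) && keep(text_[pos_])) ++pos_;
        return text_.substr(begin, pos_ - begin);
    }

    // term := '(' expression ')' | simple
    bool term() {
        if (!atEnd() && text_[pos_] == '(') {
            ++pos_;
            skipSpace();
            if (!expression()) return false;
            skipSpace();
            if (atEnd() || text_[pos_] != ')') return false;
            ++pos_;
            return true;
        }
        return !readWhile([](char c) {
            return !isOp(c) && c != '=' && c != '(' && c != ')';
        }).empty();
    }

    // expression := term (op term)*
    bool expression() {
        if (!term()) return false;
        for (;;) {
            const std::size_t save = pos_;
            skipSpace();
            if (atEnd() || !isOp(text_[pos_])) {
                pos_ = save;  // do not swallow trailing whitespace
                return true;
            }
            ++pos_;
            skipSpace();
            if (!term()) return false;
        }
    }

    std::string text_;
    std::size_t pos_ = 0;
};

// Convenience wrapper (hypothetical name).
bool parseDefinitions(const std::string& input, std::vector<ParsedDefinition>& out) {
    return DefinitionParser(input).parse(out);
}
```

Calling parseDefinitions on the question's sample string should yield the same five definitions as the Spirit grammar. One limitation of this sketch: a valueless parameter whose name starts with an operator character and directly follows a value would be read as a continuation of that value.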
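Answer 1 (score: 0)
I think I found a solution to my problem, working together with a colleague.
The main idea is captured in the following example: http://regexr.com/38tjv
Regular expression: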
(?:^|\s)(\b[a-zA-Z0-9]+\b|\b[a-zA-Z0-9]+\b\s*=\s*\b[a-zA-Z0-9\s\+\(\)]+?\b)(?=\s+\b[a-zA-Z0-9]+\b\s*=|\s*$|\s+\b[a-zA-Z0-9]+\b)
Here it is with an explanation:
static const std::string kParameterRegEx = std::string("(?:^|\\s)") // start of string or preceding space, not captured (std::string so the following + concatenates)
+ "(" // group of the parameter or parameter-value
+ "\\b" + kSuportedNamesCharsRegEx + "\\b" // simple names
+ "|" // or
+ "\\b" + kSuportedNamesCharsRegEx + "\\b\\s*=\\s*\\b" + kSuportedValuesCharsRegEx + "?\\b" // name-value
+ ")" // end group
+ "(?=" // followed by group of
+ "\\s+\\b" + kSuportedNamesCharsRegEx + "\\b\\s*=" // new parameter with value
+ "|" // or
+ "\\s*$" // end of string
+ "|" // or
+ "\\s+\\b" + kSuportedNamesCharsRegEx + "\\b" // new parameter without value
+ ")"; // end of following group
I hope this helps others who need to parse Cadence Spectre circuits.
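To try the simplified pattern outside regexr, here is a minimal sketch of how the matches could be collected. It rests on several assumptions: it uses std::regex (ECMAScript grammar) instead of Boost.Regex, it uses the simplified character classes from the regexr example rather than kSuportedNamesCharsRegEx and kSuportedValuesCharsRegEx, and the helper name findParameters is hypothetical.

```cpp
#include <regex>
#include <string>
#include <vector>

// Returns the text of capture group 1 (the parameter, with its value if any)
// for every match of the simplified pattern.
std::vector<std::string> findParameters(const std::string& input) {
    static const std::regex kPattern(
        R"re((?:^|\s)(\b[a-zA-Z0-9]+\b|\b[a-zA-Z0-9]+\b\s*=\s*\b[a-zA-Z0-9\s\+\(\)]+?\b)(?=\s+\b[a-zA-Z0-9]+\b\s*=|\s*$|\s+\b[a-zA-Z0-9]+\b))re");

    std::vector<std::string> result;
    const std::sregex_iterator end;
    for (std::sregex_iterator it(input.begin(), input.end(), kPattern); it != end; ++it) {
        result.push_back((*it)[1].str());  // group 1 excludes the leading space
    }
    return result;
}
```

For the input "p1 p2=v2 p3" this should return "p1", "p2=v2" and "p3". Note that the simplified value class does not contain '-', so richer values such as arithmetic with subtraction need the full character classes from the question.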