How to handle splitting on certain characters using regular expressions

Time: 2018-05-31 21:34:05

Tags: regex icu

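In my app, I am trying to split a string into an array based on a regular expression pattern. I want to be able to load my Volt templates and run them through our own custom rendering engine, just to learn more about how rendering engines work.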

I wrote the regular expression below to do this:

"(?s)(\\{\\{.*?\\}\\}|\\{%.*?%\\}|\\{#.*?#\\})"

Here is an example of such a template:

# {{ title }}
{{created_at}} {{created_location}}
============
Paragraphs are separated by a blank line.
2nd paragraph. *Italic*, **bold**, and `monospace`.

Itemized lists look like:
{% for (item in items) %}
    * {{ item }}
{% endfor %}

Ideally, I would like to turn it into an array like this:

[
    "# ",
    "{{ title }}",
    "\n",
    "{{created_at}}",
    " ",
    "{{created_location}}",
    "\n============\nParagraphs are separated by a blank line\n2nd paragraph. *Italic*, **bold**, and `monospace`.\n\nItemized lists look like:",
    "{% for (item in items) %}",
    "\n* {{ item }}\n",
    "{% endfor %}"
]

However, when I run the regex above, the title part disappears completely from the result. There also seem to be some problems with the line breaks. Any ideas on how I can fix this?
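One way to narrow this down is to look at what the pattern matches on its own, before any splitting happens. The following is a minimal sketch, assuming Foundation's NSRegularExpression; `tagMatches` is a hypothetical helper name, and `template` is a shortened stand-in for the template above.

```swift
import Foundation

// Pattern from the question: {{ ... }}, {% ... %} and {# ... #} tags.
// (?s) lets "." match newlines, so a tag may span several lines.
let pattern = "(?s)(\\{\\{.*?\\}\\}|\\{%.*?%\\}|\\{#.*?#\\})"

// Shortened stand-in for the template in the question (an assumption for illustration).
let template = "# {{ title }}\n{{created_at}} {{created_location}}\n{% for (item in items) %}\n    * {{ item }}\n{% endfor %}"

// Hypothetical helper: returns every substring matched by `pattern`, in order.
func tagMatches(in text: String, pattern: String) throws -> [String] {
    let regex = try NSRegularExpression(pattern: pattern, options: [])
    // NSRange is measured in UTF-16 code units, so work with the NSString view.
    let nsText = text as NSString
    let fullRange = NSRange(location: 0, length: nsText.length)
    return regex.matches(in: text, options: [], range: fullRange).map {
        nsText.substring(with: $0.range)
    }
}

// Falls back to an empty list if the pattern fails to compile.
let tags = (try? tagMatches(in: template, pattern: pattern)) ?? []
for tag in tags {
    print(tag)
}
```

If every tag, including `{{ title }}`, shows up in this list, the pattern is doing its job and the missing pieces are more likely lost in the splitting step.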

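1 answer:

Answer 0: (score: 0)

The problem was not in the regular expression but in the code I was using to split on the regular expression. I modified the code below so that it also returns the regex matches.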

import Foundation

extension NSRegularExpression {
    /// Splits `str` on this pattern and keeps the matched text in the result.
    func split(_ str: String) -> [String] {
        // NSRange is measured in UTF-16 code units, so use the NSString length
        // rather than the Character count.
        let nsString = str as NSString
        let fullRange = NSRange(location: 0, length: nsString.length)

        // get locations of matches
        let matchingRanges: [NSRange] = self.matches(in: str, options: [], range: fullRange).map { $0.range }

        // invert ranges - get ranges of non-matched pieces
        var pieceRanges: [NSRange] = []

        // add first range (the whole string when there are no matches)
        pieceRanges.append(NSRange(location: 0, length: matchingRanges.first?.location ?? nsString.length))

        // add between-splits ranges and last range
        for i in 0..<matchingRanges.count {
            let isLast = i + 1 == matchingRanges.count
            let startLoc = matchingRanges[i].location + matchingRanges[i].length
            let endLoc = isLast ? nsString.length : matchingRanges[i + 1].location
            pieceRanges.append(NSRange(location: startLoc, length: endLoc - startLoc))
        }

        // interleave: the text between two consecutive pieces is the match itself
        var pieces: [String] = []
        var previous = NSRange(location: 0, length: 0)
        for pieceRange in pieceRanges {
            let previousEnd = previous.location + previous.length
            let item = nsString.substring(with: NSRange(location: previousEnd, length: pieceRange.location - previousEnd))
            // skip empty strings (there is never a match before the first piece)
            if !item.isEmpty { pieces.append(item) }

            let piece = nsString.substring(with: pieceRange)
            if !piece.isEmpty { pieces.append(piece) }

            previous = pieceRange
        }

        return pieces
    }
}
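To try the idea out, here is a minimal, self-contained sketch. It is not the exact extension above: `splitKeepingMatches` is a hypothetical condensed variant that walks the matches once and skips empty pieces, and `template` is a shortened stand-in for the template in the question.

```swift
import Foundation

extension NSRegularExpression {
    // Hypothetical condensed variant of split(_:): returns the text between
    // matches and the matches themselves, in document order.
    func splitKeepingMatches(_ str: String) -> [String] {
        // NSRange is measured in UTF-16 code units, so work with the NSString view.
        let nsString = str as NSString
        let fullRange = NSRange(location: 0, length: nsString.length)

        var pieces: [String] = []
        var cursor = 0 // UTF-16 offset just past the previous match

        for match in matches(in: str, options: [], range: fullRange) {
            // Text between the previous match and this one, if any.
            let gapLength = match.range.location - cursor
            if gapLength > 0 {
                pieces.append(nsString.substring(with: NSRange(location: cursor, length: gapLength)))
            }
            // The matched tag itself.
            pieces.append(nsString.substring(with: match.range))
            cursor = match.range.location + match.range.length
        }

        // Trailing text after the last match, if any.
        if cursor < nsString.length {
            pieces.append(nsString.substring(from: cursor))
        }
        return pieces
    }
}

// Pattern from the question.
let pattern = "(?s)(\\{\\{.*?\\}\\}|\\{%.*?%\\}|\\{#.*?#\\})"

// Shortened stand-in for the template in the question (an assumption for illustration).
let template = "# {{ title }}\n{{created_at}} {{created_location}}\n====\n{% for (item in items) %}\n    * {{ item }}\n{% endfor %}"

// try! is acceptable here only because the pattern is a fixed, known-valid literal.
let regex = try! NSRegularExpression(pattern: pattern, options: [])
let pieces = regex.splitKeepingMatches(template)
for piece in pieces {
    print(piece.debugDescription) // shows "\n" escapes explicitly
}
```

Joining the pieces back together should reproduce the input string, which is a convenient check that neither tags nor line breaks were dropped along the way.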