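How do I undo the string.gmatch split for a specific part of a string in Lua

Asked: 2019-03-18 23:04:29

Tags: string split lua

I'm using Lua and splitting a string on spaces to write a kind of sub-language. I'm trying to make it not split anything inside parentheses, and I've reached the point where I can detect whether parentheses are present. But I want to reverse the matching for the part of the string inside the parentheses, because I want to keep the string they contain intact.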

local function split(strng)
    local __s={}
    local all_included={}
    local flag_table={}
    local uncompiled={}
    local flagged=false
    local flagnum=0

    local c=0
    for i in string.gmatch(strng,'%S+') do
        c=c+1
        table.insert(all_included,i)
        if(flagged==false)then
            if(string.find(i,'%('or'%['or'%{'))then
                flagged=true
                flag_table[tostring(c)]=1
                table.insert(uncompiled,i)
                print'flagged'
            else 
                table.insert(__s,i)
            end
        elseif(flagged==true)then
            table.insert(uncompiled,i)
            if(string.find(i,'%)' or '%]' or '%}'))then
                flagged=false
                local __=''
                for i=1,#uncompiled do
                    __=__ .. uncompiled[i]
                end
                table.insert(__s,__)
                print'unflagged'
            end
        end
    end

    return __s;
end

This is my split code.

2 answers:

Answer 0 (score: 2)

I wouldn't use gmatch at all.

local input = " this is a string (containg some (well, many) annoying) parentheses and should be split. The string contains  double   spaces. What should be done? And what about trailing spaces? "

local pos = 1
local words = {}
local last_start = pos
while pos <= #input do
    local char = string.byte(input, pos)

    if char == string.byte(" ") then
        -- a space ends the current word (consecutive spaces yield empty words)
        table.insert(words, string.sub(input, last_start, pos - 1))
        last_start = pos + 1
    elseif char == string.byte("(") then
        -- skip ahead to the matching ")", tracking the nesting depth
        local depth = 1
        while depth ~= 0 and pos < #input do
            pos = pos + 1
            local inner = string.byte(input, pos)
            if inner == string.byte(")") then
                depth = depth - 1
            elseif inner == string.byte("(") then
                depth = depth + 1
            end
        end
    end
    pos = pos + 1
end
-- whatever follows the last space is the final word
table.insert(words, string.sub(input, last_start))

-- ipairs guarantees the words are printed in order
for k, v in ipairs(words) do
    print(k, "'" .. v .. "'")
end

Output:

1   ''
2   'this'
3   'is'
4   'a'
5   'string'
6   '(containg some (well, many) annoying)'
7   'parentheses'
8   'and'
9   'should'
10  'be'
11  'split.'
12  'The'
13  'string'
14  'contains'
15  ''
16  'double'
17  ''
18  ''
19  'spaces.'
20  'What'
21  'should'
22  'be'
23  'done?'
24  'And'
25  'what'
26  'about'
27  'trailing'
28  'spaces?'
29  ''

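Thinking about trailing spaces and other such problems is left as an exercise for the reader. I tried to highlight some of the possible problems with the example I used. Also, I only looked at one kind of parenthesis, since I did not want to think about how this (string} should be ]parsed.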
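The empty entries in the output above come from the leading, trailing and doubled spaces. One possible way to deal with them, assuming empty words carry no meaning in your sub-language, is to filter them out afterwards. The helper name drop_empty is made up for this sketch:

```lua
-- Hypothetical helper: returns a new list without the empty strings.
local function drop_empty(words)
    local result = {}
    for _, word in ipairs(words) do
        if word ~= "" then
            result[#result + 1] = word
        end
    end
    return result
end

-- A shortened stand-in for the words table produced above.
local cleaned = drop_empty({ "", "this", "", "", "spaces.", "" })
print(table.concat(cleaned, ","))  --> this,spaces.
```

Alternatively, the main loop could skip the table.insert call whenever last_start == pos, which avoids building the empty entries in the first place.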

Oh, and if nested parentheses are not a concern, most of the code above can be replaced with a call to string.find(input, ")", pos, true) to find the closing parenthesis.
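A minimal sketch of that simplification, assuming parentheses are never nested. The function name split_keep_parens and the choice to treat an unclosed parenthesis as running to the end of the input are assumptions made for this example:

```lua
-- Sketch: split on spaces but keep "(...)" together.
-- Assumes parentheses are never nested.
local function split_keep_parens(input)
    local words = {}
    local pos, last_start = 1, 1
    while pos <= #input do
        local char = input:sub(pos, pos)
        if char == " " then
            words[#words + 1] = input:sub(last_start, pos - 1)
            last_start = pos + 1
        elseif char == "(" then
            -- 4th argument true = plain search, so ")" is taken literally
            -- instead of being parsed as a pattern
            local close = string.find(input, ")", pos, true)
            -- no closing parenthesis: treat the rest of the input as one word
            pos = close or #input
        end
        pos = pos + 1
    end
    words[#words + 1] = input:sub(last_start)
    return words
end

local words = split_keep_parens("call (a b c) now")
for i, word in ipairs(words) do
    print(i, "'" .. word .. "'")
end
```

The plain flag matters here: a lone ")" is not a valid Lua pattern, so without it the call would raise an error rather than find the character.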

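Note that you cannot combine patterns with or or and the way your code attempts to.

"%(" or "%[" is equal to "%("

Lua evaluates that expression from left to right, and or returns its first operand whenever that operand is truthy. "%(" is a truthy value, so Lua reduces the expression to "%(", which is logically equivalent to the full expression.

So string.find(i,'%('or'%['or'%{') will only ever find ( in i.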
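The behaviour described above can be checked directly. The second half of this sketch shows one possible fix that is not part of the original answer: a character class that matches any one of the three opening brackets. The variable names are made up for illustration:

```lua
-- `or` returns its first operand when that operand is truthy,
-- and every string is truthy in Lua.
local pattern = '%(' or '%[' or '%{'
print(pattern)                      --> %(

-- So only "(" is ever detected:
print(string.find("[x", pattern))   --> nil

-- A character class matches any one of the listed characters.
local opening = "[%(%[%{]"
print(string.find("[x", opening))   --> 1   1
print(string.find("{x", opening))   --> 1   1
print(string.find("x", opening))    --> nil
```

The same idea works for the closing brackets with "[%)%]%}]", although it still does not check that an opening bracket is closed by the matching kind.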

Answer 1 (score: 1)

As a similar but slightly different approach to Uli's answer, I would first split by parentheses. Then you can split the odd-numbered fields on whitespace:

split = require("split") -- https://luarocks.org/modules/telemachus/split

split__by_parentheses = function(input)
    local fields = {}
    local level = 0
    local field = ""

    for i = 1, #input do
        local char = input:sub(i, i)

        if char == "(" then
            if level == 0 then 
                -- add non-parenthesized field to list
                fields[#fields+1] = field 
                field = ""
            end
            level = level + 1
        end

        field = field .. char

        if char == ")" then
            level = level - 1
            assert(level >= 0, 'Mismatched parentheses')
            if level == 0 then 
                -- add parenthesized field to list
                fields[#fields+1] = field 
                field = ""
            end
        end
    end

    assert(level == 0, 'Mismatched parentheses')
    fields[#fields+1] = field
    return fields
end

input = " this is a string (containg some (well, many) annoying) parentheses and should be split. The string contains  double   spaces. What should be done? And what about trailing spaces? "

fields = split__by_parentheses(input)

for i, field in ipairs(fields) do
    print(("%d\t'%s'"):format(i, field))
    if i % 2 == 1 then
        for j, word in ipairs(split.split(field)) do
            print(("\t%d\t%s"):format(j, word))
        end
    end
end

outputs

1       ' this is a string '
        1
        2       this
        3       is
        4       a
        5       string
        6
2       '(containg some (well, many) annoying)'
3       ' parentheses and should be split. The string contains  double   spaces. What should be done? And what about trailing spaces? '
        1
        2       parentheses
        3       and
        4       should
        5       be
        6       split.
        7       The
        8       string
        9       contains
        10      double
        11      spaces.
        12      What
        13      should
        14      be
        15      done?
        16      And
        17      what
        18      about
        19      trailing
        20      spaces?
        21
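If the external split module is not available, the odd-numbered fields could instead be split with string.gmatch and the %S+ pattern from the question. This is only a sketch: split_words is a made-up stand-in rather than the module's API, and unlike the output above it produces no empty first and last entries:

```lua
-- Hypothetical stand-in for split.split: collects runs of non-space characters.
local function split_words(field)
    local words = {}
    for word in string.gmatch(field, "%S+") do
        words[#words + 1] = word
    end
    return words
end

local words = split_words(" this is a string ")
print(#words)                     --> 4
print(table.concat(words, "|"))   --> this|is|a|string
```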