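I am using Lua and splitting a string on spaces to write a small sub-language. I am trying to make it not split anything inside parentheses, and I am already at the stage where I can detect whether a parenthesis is present. What I want now is to invert the matching for the text inside the parentheses, because I want to keep the string they contain in one piece.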
local function split(strng)
    local __s={}
    local all_included={}
    local flag_table={}
    local uncompiled={}
    local flagged=false
    local flagnum=0
    local c=0
    for i in string.gmatch(strng,'%S+') do
        c=c+1
        table.insert(all_included,i)
        if(flagged==false)then
            if(string.find(i,'%('or'%['or'%{'))then
                flagged=true
                flag_table[tostring(c)]=1
                table.insert(uncompiled,i)
                print'flagged'
            else
                table.insert(__s,i)
            end
        elseif(flagged==true)then
            table.insert(uncompiled,i)
            if(string.find(i,'%)' or '%]' or '%}'))then
                flagged=false
                local __=''
                for i=1,#uncompiled do
                    __=__ .. uncompiled[i]
                end
                table.insert(__s,__)
                print'unflagged'
            end
        end
    end
    return __s;
end
This is my splitting code.
Answer 0 (score: 2)
I would not use gmatch for this at all.
local input = " this is a string (containg some (well, many) annoying) parentheses and should be split. The string contains  double   spaces. What should be done? And what about trailing spaces? "
local pos = 1
local words = {}
local last_start = pos
while pos <= #input do
    local char = string.byte(input, pos)
    if char == string.byte(" ") then
        -- a space ends the current word (the word is empty if spaces are adjacent)
        table.insert(words, string.sub(input, last_start, pos - 1))
        last_start = pos + 1
    elseif char == string.byte("(") then
        -- skip ahead to the matching closing parenthesis, tracking nesting depth
        local depth = 1
        while depth ~= 0 and pos < #input do
            local next_char = string.byte(input, pos + 1)
            if next_char == string.byte(")") then
                depth = depth - 1
            elseif next_char == string.byte("(") then
                depth = depth + 1
            end
            pos = pos + 1
        end
    end
    pos = pos + 1
end
-- whatever follows the last space is the final word
table.insert(words, string.sub(input, last_start))
for k, v in ipairs(words) do
    print(k, "'" .. v .. "'")
end
Output:
1 ''
2 'this'
3 'is'
4 'a'
5 'string'
6 '(containg some (well, many) annoying)'
7 'parentheses'
8 'and'
9 'should'
10 'be'
11 'split.'
12 'The'
13 'string'
14 'contains'
15 ''
16 'double'
17 ''
18 ''
19 'spaces.'
20 'What'
21 'should'
22 'be'
23 'done?'
24 'And'
25 'what'
26 'about'
27 'trailing'
28 'spaces?'
29 ''
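Thinking about trailing spaces and other such issues is left as an exercise for the reader. I tried to highlight some of the possible problems with the example I used. Also, I only looked at one kind of parenthesis, since I did not want to think about how this (string} should be ]parsed.
Oh, and if nested parentheses are not a concern, most of the code above can be replaced by a call to string.find(input, ")", pos, true) to find the closing parenthesis.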
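As a rough sketch of that non-nested variant (the function name split_flat and the choice to swallow the rest of the string when no closing parenthesis exists are assumptions made for illustration, not part of the answer above):

```lua
-- Split on single spaces, keeping non-nested "(...)" groups intact.
-- split_flat is a hypothetical name; nested parentheses are NOT handled.
local function split_flat(input)
  local words = {}
  local pos, last_start = 1, 1
  while pos <= #input do
    local char = string.sub(input, pos, pos)
    if char == " " then
      table.insert(words, string.sub(input, last_start, pos - 1))
      last_start = pos + 1
    elseif char == "(" then
      -- The fourth argument (plain = true) turns off pattern matching,
      -- so ")" is searched for as a literal character.
      local close = string.find(input, ")", pos, true)
      -- Assumption: with no closing parenthesis, consume the rest of the string.
      pos = close or #input
    end
    pos = pos + 1
  end
  table.insert(words, string.sub(input, last_start))
  return words
end

local words = split_flat("a (b c) d")
for i, w in ipairs(words) do
  print(i, "'" .. w .. "'")
end
```

The plain flag matters here: without it, ")" would be treated as a Lua pattern, and an unescaped closing parenthesis is not a valid pattern.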
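Note that you cannot combine patterns with "or" or "and" the way your code tries to.
"%(" or "%[" is equal to "%(".
Lua evaluates that expression from left to right. "%(" is a true value, so Lua reduces the expression to "%(", which is logically the same as the full expression.
So string.find(i,'%('or'%['or'%{') will only ever find ( in i.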
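One way to match any of the three opening brackets is a character class. The following is a minimal sketch; the helper name has_open_bracket is made up for illustration:

```lua
-- `or` works on Lua values, not on patterns: the first operand that is
-- neither nil nor false is the result.
local pattern = '%(' or '%[' or '%{'
print(pattern)

-- A character class matches any one of the listed characters.
-- has_open_bracket is a hypothetical helper name.
local function has_open_bracket(word)
  return string.find(word, '[%(%[%{]') ~= nil
end

print(has_open_bracket('(foo'))
print(has_open_bracket('[foo'))
print(has_open_bracket('{foo'))
print(has_open_bracket('foo'))
```

The same idea applies to the closing side with '[%)%]%}]'. Note that a character class only detects that some bracket is present; it does not check that opening and closing brackets are of the same kind.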
Answer 1 (score: 1)
As a similar but slightly different approach to Uli's answer, I would first split by parentheses. Then you can split the odd-numbered fields on whitespace:
split = require("split") -- https://luarocks.org/modules/telemachus/split
split__by_parentheses = function(input)
    local fields = {}
    local level = 0
    local field = ""
    for i = 1, #input do
        local char = input:sub(i, i)
        if char == "(" then
            if level == 0 then
                -- add non-parenthesized field to list
                fields[#fields+1] = field
                field = ""
            end
            level = level + 1
        end
        field = field .. char
        if char == ")" then
            level = level - 1
            assert(level >= 0, 'Mismatched parentheses')
            if level == 0 then
                -- add parenthesized field to list
                fields[#fields+1] = field
                field = ""
            end
        end
    end
    assert(level == 0, 'Mismatched parentheses')
    fields[#fields+1] = field
    return fields
end
input = " this is a string (containg some (well, many) annoying) parentheses and should be split. The string contains double spaces. What should be done? And what about trailing spaces? "
fields = split__by_parentheses(input)
for i, field in ipairs(fields) do
    print(("%d\t'%s'"):format(i, field))
    if i % 2 == 1 then
        for j, word in ipairs(split.split(field)) do
            print(("\t%d\t%s"):format(j, word))
        end
    end
end
outputs
1 ' this is a string '
1
2 this
3 is
4 a
5 string
6
2 '(containg some (well, many) annoying)'
3 ' parentheses and should be split. The string contains double spaces. What should be done? And what about trailing spaces? '
1
2 parentheses
3 and
4 should
5 be
6 split.
7 The
8 string
9 contains
10 double
11 spaces.
12 What
13 should
14 be
15 done?
16 And
17 what
18 about
19 trailing
20 spaces?
21
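If pulling in an external module is not wanted, the same two-stage idea can be sketched with only the standard string library. This is an illustrative variant, not the code from either answer: the name tokenize is hypothetical, and it assumes that splitting on runs of whitespace (which drops empty fields) is acceptable.

```lua
-- tokenize is a hypothetical name. It keeps balanced "(...)" groups whole
-- and splits everything else on runs of whitespace, so no empty fields appear.
local function tokenize(input)
  local tokens = {}
  local level = 0
  local start = 1 -- start index of the chunk currently being scanned
  for i = 1, #input do
    local char = input:sub(i, i)
    if char == "(" then
      if level == 0 then
        -- split the text before the group on whitespace
        for word in input:sub(start, i - 1):gmatch("%S+") do
          tokens[#tokens + 1] = word
        end
        start = i
      end
      level = level + 1
    elseif char == ")" then
      level = level - 1
      assert(level >= 0, "Mismatched parentheses")
      if level == 0 then
        -- the whole parenthesized group becomes one token
        tokens[#tokens + 1] = input:sub(start, i)
        start = i + 1
      end
    end
  end
  assert(level == 0, "Mismatched parentheses")
  -- split whatever follows the last group
  for word in input:sub(start):gmatch("%S+") do
    tokens[#tokens + 1] = word
  end
  return tokens
end

for i, token in ipairs(tokenize(" a  (b (c) d) e ")) do
  print(i, "'" .. token .. "'")
end
```

As in the answer above, a group that touches a word, such as foo(bar), comes out as two tokens, because the parentheses are split off before the whitespace pass.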