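I'm trying to get the content inside a pair of matching parentheses, which may itself contain zero or more pairs of nested parentheses. This is a well-known problem. For example, when I provide "Well (this part (should be) (in the (result)), but) not this part" as input, "this part (should be) (in the (result)), but" should be the result.
The following Elixir code does the job. It walks through the string, counts the matching pairs of parentheses, and then returns the correct substring.
The problem is that it is exactly the code I would write in an imperative language. I wonder whether there is a different, more idiomatic functional way to write it. Can somebody help me improve my functional programming?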
    def get_content_of_first_pair_of_parentheses(s) do
      cl = String.to_charlist(s)
      # 40 is the code point of "(" and 41 is the code point of ")"
      first_opening = Enum.find_index(cl, fn(x) -> x == 40 end)
      sub_cl = Enum.slice(cl, (first_opening+1)..-1)
      # Accumulate characters (in reverse) while counting open parentheses
      content = Enum.reduce_while(sub_cl,
                                  {[], 1},
                                  fn(x, {list, counter} = acc) ->
                                    if counter < 1 do
                                      {:halt, acc}
                                    else
                                      case x do
                                        40 -> {:cont, {[x | list], counter + 1}}
                                        41 -> {:cont, {[x | list], counter - 1}}
                                        _ -> {:cont, {[x | list], counter}}
                                      end
                                    end
                                  end)
      content
      |> elem(0)
      # Drop the matching closing parenthesis, which is at the head of the list
      |> Enum.slice(1..-1)
      |> Enum.reverse()
      |> List.to_string()
    end
Edit: This is the code I ended up with. It is the same as what @dogbert suggested, but restructured in a way that I find more readable.
    def get_content_of_first_pair_of_parentheses(s) do
      subs = substring_after_first_opening_paren(s)
      length_after_first_opening_paren = byte_size(subs)
      length_after_matching_closing_paren =
        subs
        |> substring_after_matching_closing_paren(0)
        |> byte_size()
      binary_part(subs, 0, length_after_first_opening_paren - length_after_matching_closing_paren)
    end

    defp substring_after_first_opening_paren(<<"(", rest::binary>>), do: rest
    defp substring_after_first_opening_paren(<<_, rest::binary>>), do: substring_after_first_opening_paren(rest)

    defp substring_after_matching_closing_paren(<<")", _::binary>> = rest, 0), do: rest
    defp substring_after_matching_closing_paren(<<")", rest::binary>>, n), do: substring_after_matching_closing_paren(rest, n - 1)
    defp substring_after_matching_closing_paren(<<"(", rest::binary>>, n), do: substring_after_matching_closing_paren(rest, n + 1)
    defp substring_after_matching_closing_paren(<<_::utf8, rest::binary>>, n), do: substring_after_matching_closing_paren(rest, n)
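Answer 0 (score: 4)
Here is how I would do it, with pattern matching on binaries. Creating a charlist is quite inefficient in most cases. The following code does not create any new binaries; it only creates sub-binaries of the existing one and returns a sub-binary at the end.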
    defmodule A do
      def go(<<"(", rest::binary>>) do
        remaining = byte_size(go(rest, 0))
        binary_part(rest, 0, byte_size(rest) - remaining)
      end
      def go(<<_::utf8, rest::binary>>), do: go(rest)

      def go(<<")", _::binary>> = rest, 0), do: rest
      def go(<<")", rest::binary>>, n), do: go(rest, n - 1)
      def go(<<"(", rest::binary>>, n), do: go(rest, n + 1)
      def go(<<_::utf8, rest::binary>>, n), do: go(rest, n)
    end

    IO.puts A.go("Well (this part (should be) (in the (result)), but) not this part")
    IO.puts A.go("(foo bar ())")
    IO.puts A.go("(foo bar ()) baz")
    IO.puts A.go("zz (foo bar ()) baz")
    IO.puts A.go("foo (bar) baz)")
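Output:
    this part (should be) (in the (result)), but
    foo bar ()
    foo bar ()
    foo bar ()
    bar
The logic is very similar to yours. First, skip ahead until the first opening parenthesis is found. Then keep track of the nesting level, and stop as soon as a closing parenthesis is found while the level is 0.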
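The two phases described above can also be written so that malformed input is reported instead of raising a `FunctionClauseError`. The following is only a sketch under stated assumptions: the module name `ParenSketch`, the function names, and the `{:ok, content}` / `:error` return shape are hypothetical and do not come from the answer. It matches single bytes rather than `::utf8` code points, which is safe here because the bytes for "(" and ")" never occur inside a multi-byte UTF-8 sequence.

```elixir
defmodule ParenSketch do
  # Hypothetical names; an illustrative variant, not the answer's code.
  # Phase 1: skip bytes until the first "(" is found.
  def first_group(<<"(", rest::binary>>) do
    case skip_to_close(rest, 0) do
      {:ok, remaining} ->
        # Keep everything in `rest` that precedes the matching ")".
        {:ok, binary_part(rest, 0, byte_size(rest) - byte_size(remaining))}

      :error ->
        :error
    end
  end

  def first_group(<<_, rest::binary>>), do: first_group(rest)
  # End of input without ever seeing "(".
  def first_group(<<>>), do: :error

  # Phase 2: track the nesting depth until the matching ")" at depth 0.
  defp skip_to_close(<<")", _::binary>> = rest, 0), do: {:ok, rest}
  defp skip_to_close(<<")", rest::binary>>, depth), do: skip_to_close(rest, depth - 1)
  defp skip_to_close(<<"(", rest::binary>>, depth), do: skip_to_close(rest, depth + 1)
  defp skip_to_close(<<_, rest::binary>>, depth), do: skip_to_close(rest, depth)
  # End of input while a parenthesis is still open.
  defp skip_to_close(<<>>, _depth), do: :error
end

IO.inspect(ParenSketch.first_group("zz (foo bar ()) baz"))  # {:ok, "foo bar ()"}
IO.inspect(ParenSketch.first_group("(foo bar ()"))          # :error
IO.inspect(ParenSketch.first_group("no parens"))            # :error
```

Returning a tagged tuple lets the caller pattern match on success or failure, at the cost of one extra `case` in the entry clause.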
Answer 1 (score: 3)
You can use a regular expression to get the desired output:
    str = "Well (this part (should be) (in the (result)), but) not this part"
    Regex.run(~r/\((.*)\)/, str) |> Enum.at(1)
By default, regex patterns match greedily, so this matches everything from the first paren to the last paren in the string.
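Because the greedy match runs to the last ")" in the string, it only gives the right answer when that last ")" is the one matching the first "(". The sketch below shows the difference on one of the inputs from the other answer, and a possible alternative. The alternative assumes that the PCRE engine behind Elixir's `Regex` supports the recursive `(?R)` construct, as described in the Erlang `re` documentation; the variable names are illustrative.

```elixir
str = "foo (bar) baz)"

# Greedy pattern from above: captures up to the LAST ")" in the string.
greedy = Regex.run(~r/\((.*)\)/, str) |> Enum.at(1)
IO.inspect(greedy)    # "bar) baz"

# Recursive pattern: inside the parentheses, accept either a single
# non-parenthesis character or a whole nested balanced group via (?R).
balanced_pattern = ~r/\(((?:[^()]|(?R))*)\)/
balanced = Regex.run(balanced_pattern, str) |> Enum.at(1)
IO.inspect(balanced)  # "bar"
```

Note that `Regex.run/2` returns `nil` when nothing matches, so input without a balanced pair would need to be handled before calling `Enum.at/2`.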
Answer 2 (score: 2)
This solution is similar to Dogbert's, but it works in a purely streaming fashion:
    defmodule A do
      def go(input), do: go(input, 0, "")

      # Arguments: remaining input, current nesting depth, accumulated content
      def go(<<"(", rest::binary>>, 0, _), do: go(rest, 1, "")
      def go(<<"(", rest::binary>>, num, acc), do: go(rest, num + 1, acc <> "(")
      def go(<<")", _::binary>>, 1, acc), do: acc
      # The guard keeps a stray ")" before the first "(" from driving the depth negative
      def go(<<")", rest::binary>>, num, acc) when num > 1, do: go(rest, num - 1, acc <> ")")
      def go(<<_::binary-size(1), rest::binary>>, 0, _), do: go(rest, 0, "")
      def go(<<letter::binary-size(1), rest::binary>>, num, acc),
        do: go(rest, num, acc <> letter)
      def go(_, _, acc),
        do: "⚑ Unmatched parentheses found. Accumulated so far: “#{acc}”"
    end

    IO.puts A.go("Well (this part (should be) (in the (result)), but) not this part")
    IO.puts A.go("(foo bar ()")
Produces:
    this part (should be) (in the (result)), but
    ⚑ Unmatched parentheses found. Accumulated so far: “foo bar ()”