Question

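The goal is a script that reads, line by line, a file containing file paths (Windows and Linux). It strips the path and keeps only the file name with its extension. It then replaces any special character in the file name with "_" (an underscore) and finally reduces consecutive underscores to a single one, so st__a___ck becomes st_a_ck. I got it working, but I believe there may be a better or nicer-looking way to do it. I am very much a beginner and still learning to think the Elixir/functional way. What I would like is to see different ways of doing this, and ways to improve and refine it.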

Test samples:

c:\program files\mydir\mydir2\my&@Doc.doc 
c:\program files\mydir\mydir2\myD$oc2.doc\ 
c:\\program files\\mydir\\mydir2\\myD;'oc2.doc
c:\\program files\\mydir\mydir2\\my[Doc2.doc\\
/home/python/projects/files.py
/home/python/projects/files.py/
//home//python//projects//files.py
//home//python//projects//files.py//
c:\program files\mydir\mydir2\my!D#oc.doc 
c:\program files\mydir\mydir2\myDoc2.doc\ 
c:\\program files\\mydir\\mydir2\\my';Doc2.doc
c:\\program files\\mydir\mydir2\\myD&$%oc2.doc\\
/home/python/projects/f_)*iles.py
/home/python/projects/files.py/
//home//python//projects//fi=-les.py
//home//python//projects//fil !%es.py//
/home/python/projects/f_)* iles.py
/home/python/projects/fi les.py/
//home//python//projects//fii___kiii=- les.py 
//home//python//projects//ff###f!%#illfffl! %es.py//

Code:

defmodule Paths do

     def read_file(filename) do
         File.stream!(filename)
         |> Enum.map( &(String.replace(&1,"\\","/")) )
         |> Enum.map( &(String.trim(&1,"\n")) )
         |> Enum.map( &(String.trim(&1,"/")) )
         |> Enum.map( &(String.split(&1,"/")) )
         |> Enum.map( &(List.last(&1)) )
         |> Enum.map( &(String.split(&1,".")) )
         |> Enum.map( &(remove_special)/1 )
         |> Enum.map( &(print_name_and_suffix)/1 )

     end
     defp print_name_and_suffix(str) do
         [h|t] = str
         IO.puts "Name: #{h}\t suffix: #{t}\t: #{h}.#{t}"
     end
     defp remove_special(str) do
         [h|t] = str
         h = String.replace(h, ~r/[\W]/, "_")
         h = String.replace(h, ~r/_+/, "_")
         [h]++t
     end

end

Paths.read_file("test.txt")

Any insight is appreciated.

Edit: I refactored the code a bit. Which version is more Elixir-style?

defmodule Paths do

     def read_file(filename) do
         File.stream!(filename)
         |> Enum.map( &(format_path)/1 )
         |> Enum.map( &(remove_special)/1 )
         |> Enum.map( &(print_name_and_suffix)/1 )

     end

     defp format_path(path) do
             path
             |> String.replace("\\","/")
             |> String.trim("\n")
             |> String.trim("/")
             |> String.trim("\\")
     end

     defp print_name_and_suffix(str) do
         [h|t] = str
         IO.puts "Name: #{h}\t suffix: #{t}\t: #{h}#{t}"
     end

     defp remove_special(str) do
         ext = Path.extname(str)
         filename = Path.basename(str)
             |> String.trim(ext)
             |> String.replace(~r/[\W]/, "_")
             |> String.replace( ~r/_+/, "_")

         [filename]++ext
     end

end

Paths.read_file("test.txt")

Answer 1

First, I will point out the general problems with the code.

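File.stream!/3 returns a Stream, which is designed to be consumed lazily, one line at a time. Piping it into Enum.map/2 makes little sense, because every Enum.map/2 call builds a complete intermediate list. Use Stream.map/2 so each line flows through the whole pipeline lazily; note that Stream.map/2 itself runs sequentially, so concurrent processing would need something like Task.async_stream/3.
Formatting matters. We use 2 spaces for indentation. Use the Elixir formatter (the mix format task) to format the code.
Where possible, destructure directly in the function head: instead of defp print_name_and_suffix(str), do: [h|t] = str ..., write defp print_name_and_suffix([h|t]).
Minimize the number of string-replace calls, since each call makes a separate pass over the string.
Use separate function clauses with pattern matching to make life simpler.
Where applicable, try binary pattern matching and recursion.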
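
Two of the points above, destructuring in the function head and collapsing several replace calls into one, can be sketched as follows. This is only an illustration: the module name HeadMatchSketch and the functions describe/1 and squeeze/1 are hypothetical and do not appear in the question's code.

```elixir
defmodule HeadMatchSketch do
  # Destructure the [name | extension_parts] list in the function head
  # instead of binding `str` and matching on it inside the body.
  def describe([name | ext_parts]) do
    "Name: #{name}\t suffix: #{Enum.join(ext_parts, ".")}"
  end

  # A single pass: any run of underscores and/or non-word characters
  # is replaced by exactly one underscore.
  def squeeze(name), do: String.replace(name, ~r/[_\W]+/, "_")
end

# Example calls on values taken from the question:
HeadMatchSketch.squeeze("st__a___ck")
HeadMatchSketch.describe(["files", "py"])
```

Putting the underscore inside the character class is what lets one regex do the work of the two consecutive String.replace/3 calls in the question.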

That said, the most Elixir-ish approach would be:

defmodule Paths do
  def read_file(filename) do
    filename
    |> File.stream!()
    |> Stream.map(&right_trim/1)
    |> Stream.map(&strip_path/1)
    |> Stream.map(&split_and_cleanup/1)
    |> Stream.map(&name_and_suffix/1)
    |> Enum.to_list()
  end

  defp right_trim(str), do: Regex.replace(~r/\W+\z/, str, "")

  # Walk the binary recursively. Every slash or backslash resets the
  # accumulator, so what is left at the end is the part after the last separator.
  defp strip_path(input, acc \\ "")
  defp strip_path("", acc), do: acc
  defp strip_path(<<"\\", rest::binary>>, _acc), do: strip_path(rest, "")
  defp strip_path(<<"/", rest::binary>>, _acc), do: strip_path(rest, "")

  defp strip_path(<<chr::binary-size(1), rest::binary>>, acc),
    do: strip_path(rest, acc <> chr)

  defp split_and_cleanup(str) do
    str
    |> String.split(".")
    |> Enum.map(&String.replace(&1, ~r/[_\W]+/, "_"))
  end

  defp name_and_suffix([file, ext]) do
    IO.puts "Name: #{file}\t suffix: .#{ext}\t: #{file}.#{ext}"
  end
end

Paths.read_file("/tmp/test.txt")

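Note mainly the strip_path/2 function: it parses the input string recursively and returns the part after the last slash, forward or backward. I could have used String.split/2 or any of the built-in functions from the String module, but I deliberately implemented it in the most functional way.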
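
For comparison, the String.split-based alternative mentioned above might look roughly like this. It is a minimal sketch, not the answer's method: the module name SplitSketch and the function file_name/1 are hypothetical, and it trims whitespace rather than all trailing non-word characters, so it is not a drop-in replacement for right_trim/1 plus strip_path/2.

```elixir
defmodule SplitSketch do
  # Returns the last path segment of a line, treating both "\" and "/"
  # as separators. Returns nil for a line with no segments at all.
  def file_name(line) do
    line
    |> String.trim()
    |> String.replace("\\", "/")
    # trim: true drops the empty strings produced by doubled or
    # trailing separators.
    |> String.split("/", trim: true)
    |> List.last()
  end
end

# Example call on one of the question's sample lines:
SplitSketch.file_name("//home//python//projects//files.py//\n")
```

Normalising backslashes first keeps the behaviour the same on every operating system, since it does not rely on the platform's own path separator rules.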

Elixir: how to improve code and style
