Question

I have an assignment to use 'sed' to extract the Morse code (dashes and periods) from a text file containing the following:

A test to see if the morse code can be removed from a file. .--- -. ..
This is a test --. -.- .-- .. -.. --- .- .. of sorts and so on. Let's see if the code snippets can be found.
Also can they be .- . -.- removed and yet leave the periods at the end
of sentences alone. ---- -. There are also hyphenated words like the
following: Edgar-Jones. -.

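Right now I can use sed to remove all the characters [a-z] and [A-Z], but the problem is that the periods at the ends of sentences get picked up, as does the hyphen in Edgar-Jones. I can't find a way to take those out either...

Any help is appreciated, thanks.

Thanks for all the answers, everyone was very helpful. This is the one I went with: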

sed "s/[a-zA-Z][-.]//g;s/[a-zA-Z: ']*//g" file

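It finds any dash or period that directly follows a letter and removes that pair, which was the part I was having trouble with. Then it cleans up the remaining letters, spaces, colons, and apostrophes.

Thanks again!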

Answer 1

Here is an awk that handles this.

awk '{for (i=1;i<=NF;i++) if ($i!~/[a-zA-Z0-9]/) printf "%s ",$i;print ""}' file
.--- -. ..
--. -.- .-- .. -.. --- .- ..
.- . -.-
---- -.
-.

This tests each field and does not print it if it contains a letter or a digit (the pattern is [a-zA-Z0-9]).

Or, as glenn commented:

awk '{for (i=1;i<=NF;i++) if ($i~/^[.-]+$/) printf "%s ",$i;print ""}' file
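For what it's worth, the whitelist test can be wrapped so that it leaves no trailing space and skips lines that contain no Morse tokens at all. This is only a sketch of that idea; the function name extract_morse is made up for illustration and does not come from either command above.

```shell
# extract_morse: hypothetical helper name (an assumption, not from the answer).
# Prints the dot/dash-only fields of each line, separated by single spaces.
extract_morse() {
  awk '{
    out = ""
    for (i = 1; i <= NF; i++)
      if ($i ~ /^[.-]+$/)                      # field is made only of dots and dashes
        out = (out == "" ? $i : out " " $i)    # join kept fields with one space
    if (out != "") print out                   # skip lines without any Morse token
  }' "$@"
}
```

It reads the named files, or standard input when called without arguments, for example `extract_morse file`.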

Answer 2

sed 's/\(^\|[[:blank:]]\)[^[:blank:]]*[^-.[:blank:]][^[:blank:]]*/ /g' file

               .--- -. ..
     --. -.- .-- .. -.. --- .- ..              
     .- . -.-         
    ---- -.       
   -.

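That regular expression is:

the start of the line or a blank
zero or more non-blank characters
followed by one character that is neither a blank nor a Morse character
followed by zero or more non-blank characters

This identifies words that contain at least one non-Morse character and replaces each of them with a single space.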
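The same pattern can also be sketched in ERE syntax, which avoids the \| alternation (a GNU extension in basic regular expressions). This assumes a sed that accepts -E, as GNU and BSD sed both do. The three follow-up substitutions that tidy the leftover blanks are an addition, not part of the command above, and strip_words is a hypothetical name.

```shell
# strip_words: hypothetical helper name (an assumption, not from the answer).
# 1st s///: replace every word containing a non-Morse character with one space.
# 2nd s///: squeeze each run of blanks down to a single space.
# 3rd and 4th s///: drop the leading and trailing space that may remain.
strip_words() {
  sed -E '
    s/(^|[[:blank:]])[^[:blank:]]*[^-.[:blank:]][^[:blank:]]*/ /g
    s/[[:blank:]]+/ /g
    s/^ //
    s/ $//
  ' "$@"
}
```

Called as `strip_words file`, this should print the Morse tokens of each line separated by single spaces, without the padding seen in the raw output.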

It's simpler with GNU grep; too bad you can't use it:

grep -oP '(?<=^|\s)[.-]+(?=\s|$)' file
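With -o, grep prints each match on its own line rather than grouped by input line. If -P is not available, roughly the same token list can be had by splitting on blanks first. This is a rough sketch, not an exact equivalent, and morse_tokens is a hypothetical name.

```shell
# morse_tokens: hypothetical helper name (an assumption, not from the answer).
# Put every blank-separated word on its own line, then keep only the
# lines made entirely of dots and dashes.
morse_tokens() {
  cat "$@" | tr ' \t' '\n\n' | grep -E '^[.-]+$'
}
```

Usage would be `morse_tokens file`, giving one Morse token per output line.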

Answer 3

This sed one-liner should do the job (extract the Morse code, i.e. dashes and periods) on your example file:

sed "s/[a-zA-Z][-.]//g;s/[a-zA-Z: ']*//g" file

Test with your file:

kent$  cat f1
A test to see if the morse code can be removed from a file. .--- -. ..
This is a test --. -.- .-- .. -.. --- .- .. of sorts and so on. Let's see if the code snippets can be found.
Also can they be .- . -.- removed and yet leave the periods at the end
of sentences alone. ---- -. There are also hyphenated words like the
following: Edgar-Jones. -.

kent$  sed "s/[a-zA-Z][-.]//g;s/[a-zA-Z: ']*//g" f1
.----...
--.-.-.--..-..---.-..
.-.-.-
-----.
-.
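Note that the output above runs the Morse letters together, because the second substitution also deletes spaces. If the gaps between letters matter, one possible variant (a sketch, not the answerer's command) leaves spaces out of the second bracket expression and squeezes them afterwards. The name morse_spaced is hypothetical, and like the original it only handles the punctuation that appears in the sample file.

```shell
# morse_spaced: hypothetical helper name (an assumption, not from the answer).
# Steps: drop letter+dash / letter+period pairs, drop remaining letters,
# colons and apostrophes (but not spaces), squeeze runs of spaces,
# then trim the leading and trailing space.
morse_spaced() {
  sed "s/[a-zA-Z][-.]//g;s/[a-zA-Z:']*//g;s/  */ /g;s/^ //;s/ \$//" "$@"
}
```

For the sample file, `morse_spaced f1` should keep the tokens separated, e.g. `.--- -. ..` for the first line instead of `.----...`.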

Answer 4

sed 's/\.$//
     s/\([^-[:space:].]\{1,\}[-.]\{0,1\}\)*//g
     s/\([[:space:]]\)\{2,\}/\1/g
     ' YourFile

POSIX version.
