Question

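I have a file that contains the following text:

Content-of this /media/news/section3/S02/basic/Name of the file.mp4 then 545756.
Content-of this /media/news/section3/S02/Name of the file.mp4 then 42346.
Content-of this /media/news/random3/S02/basic/Name of the file.mp4 then 543.
Content-of this /media/news/random3/S02/basic/Name of the file.mp4 then 789.

I want to get rid of the "Content-of this /media/news/section3" or "Content-of this /media/news/random3" part and the trailing "then" plus number, so that only "Name of the file.mp4" is left. Sometimes the name of the file is also printed like "Name.of.the.file.mp4".

I have tried different approaches, but I am just a beginner and it gets confusing quickly, especially with the forward slashes. Any help would be greatly appreciated.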

Answer 1

Try:

 sed 's/.*\/\(.*mp4\).*/\1/' /path/to/your/file.txt
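For anyone wondering why this works, here is a minimal sketch that runs the same expression on one of the sample lines from this thread. The variable names `line` and `name` are only for illustration.

```shell
# One sample line from the thread (variable name is illustrative).
line='Content-of this /media/news/section3/S02/basic/Name of the file.mp4 then 545756.'

# .*\/       greedy: consumes everything up to and including the LAST slash
# \(.*mp4\)  captures from there through "mp4"
# .*         consumes whatever is left on the line
# \1         replaces the whole line with the captured group
name=$(printf '%s\n' "$line" | sed 's/.*\/\(.*mp4\).*/\1/')
printf '%s\n' "$name"
```

The backslash in front of the slash is needed only because `/` is also being used as the delimiter of the `s` command.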

Answer 2

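This doesn't directly answer your question, but it may serve your needs anyway:

If these are mp4 files on your computer, as you describe, you can get the names of the files as follows: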

find /path/to/some/base/dir -type f -name "*.mp4" -exec basename {} \;

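This will give the file names (without the leading directory path) of all mp4 files under /path/to/some/base/dir.

If these are actually lines in a file that you need to process, the following should work, although it is a bit hacky: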

awk 'BEGIN{FS="/"} {print $NF}' input_file.txt | awk '{$NF=$(NF-1)=""; print}'

Answer 3

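Assuming your file is named files.txt, and assuming you are only interested in mp4 files, the following sed command should work for names with or without dots in them. Note that -i edits files.txt in place: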

sed -i "s/^.*\/\(.*mp4\).*$/\1/g" files.txt

I named my file files.txt, and these are its contents before and after the command above:

Before:

Content-of this /media/news/section3/S02/basic/Name of the file.mp4 then 545756.
Content-of this /media/news/section3/S02/Name of the file.mp4 then 42346.
Content-of this /media/news/random3/S02/basic/Name.of.the.file.mp4 then 543.
Content-of this /media/news/random3/S02/basic/Name of the file.mp4 then 789.

After:

Name of the file.mp4
Name of the file.mp4
Name.of.the.file.mp4
Name of the file.mp4
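Since -i overwrites the file, it may be worth previewing the result first. The following is a rough sketch under my own assumptions: the temporary directory and the output name names.txt are made up for the example and are not part of the answer above.

```shell
# Work on a throwaway copy so a real files.txt is never touched
# (the temp directory and names.txt are illustrative choices).
tmpdir=$(mktemp -d)
printf '%s\n' 'Content-of this /media/news/section3/S02/Name of the file.mp4 then 42346.' > "$tmpdir/files.txt"

# Without -i, sed writes the result to stdout and leaves the file unchanged.
preview=$(sed 's/^.*\/\(.*mp4\).*$/\1/' "$tmpdir/files.txt")
printf '%s\n' "$preview"

# Alternatively, redirect the output to a new file instead of editing in place.
sed 's/^.*\/\(.*mp4\).*$/\1/' "$tmpdir/files.txt" > "$tmpdir/names.txt"
```

Once the preview looks right, the -i form can be run on the real file.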

Answer 4

Another solution:

awk '{gsub(/[^.]*\//,""); for(i=1;i<=NF-2;i++) {printf "%s ", $i} print ""}' file

Answer 5

There is no need for awk or sed. You can simply use grep:

grep -o "[^/]*\.mp4" file

Explanation:

-o, --only-matching
       Print only the matched (non-empty) parts of a matching line, with each
       such part on a separate output line.

[^/]*   Match anything not a forward slash any number of times

\.mp4   Remember to escape the dot metacharacter.
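As a quick check of the explanation above, here is a small sketch that feeds two of the sample lines from this thread to the same grep pattern. The variables `sample` and `names` are illustrative only.

```shell
# Two sample lines from the thread, held in a variable for illustration.
sample='Content-of this /media/news/section3/S02/basic/Name of the file.mp4 then 545756.
Content-of this /media/news/random3/S02/basic/Name.of.the.file.mp4 then 543.'

# [^/]* cannot cross a slash, so the match has to start after the last
# directory separator; \.mp4 then anchors the end of the match.
names=$(printf '%s\n' "$sample" | grep -o '[^/]*\.mp4')
printf '%s\n' "$names"
```

Because `[^/]*` also matches dots, names such as Name.of.the.file.mp4 are picked up whole.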

Answer 6

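To avoid confusion with forward slashes, it helps to know that sed's s command is not tied to /. The usual form of the s command is s/pattern/replacement/, but you can replace the forward slashes with another character, such as a comma. So, to paraphrase @adayzdone's answer, you can write the substitution in this form: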

s,pattern,replacement,
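Here is a sketch of what that could look like, assuming the substitution being paraphrased is the one from Answer 1 (that mapping is my guess, not something stated in the thread). The variable names are illustrative.

```shell
# Sample line from the thread (variable names are illustrative).
line='Content-of this /media/news/random3/S02/basic/Name.of.the.file.mp4 then 543.'

# With "," as the delimiter, the "/" in the pattern needs no backslash.
with_comma=$(printf '%s\n' "$line" | sed 's,.*/\(.*mp4\).*,\1,')

# The same substitution with the usual "/" delimiter, for comparison.
with_slash=$(printf '%s\n' "$line" | sed 's/.*\/\(.*mp4\).*/\1/')

printf '%s\n' "$with_comma" "$with_slash"
```

Whichever delimiter you choose must not appear unescaped in the pattern or the replacement, so pick a character that does not occur in your data.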

Sed pattern with forward slashes
