Question

Here are the contents of the file:

xxx_component1-1.0-2-2acd314.xc-linux-x86-64-Release-devel.r
xxx_component2-3.0-1-fg3sdhd.xc-linux-x86-64-Release-devel.r
xxx_component3-1.0-2-3gsjcgd.xc-linux-x86-64-Release-devel.r
xxx_component4-0.0-2-2acd314.xc-linux-x86-64-Release-devel.r

I want to extract the component names: component1, component2, and so on.

This is what I tried:

for line in `sed -n -e '/^xxx-/p' $file`
do
    comp=`echo $line | sed  -e '/xxx-/,/[0-9]/p'`
    echo "comp - $comp"
done

I also tried sed -e 's/.*xxx-\(.*\)[^0-9].*/\1/'

This is based on some information I found online. Please give me a sed command and, if possible, a step-by-step explanation.

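Part 2. I also need to extract the version number from the string. The version number starts with a digit and ends with a digit. It is followed by xc-linux. As you can see, to keep it unique, it includes a random alphanumeric string (length 7) as part of the version number.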

For example, in the string xxx_component1-1.0-2-2acd314.xc-linux-x86-64-Release-devel.r the version number is: 1.0-2-2acd314

Answer 1

There are many ways to extract the data. The simplest is grep.

GNU `grep`：

You can use GNU grep with the PCRE option -P to get the required data:

$ cat file
xxx_component1-1.0-2-2acd314.xc-linux-x86-64-Release-devel.r
xxx_component2-3.0-1-fg3sdhd.xc-linux-x86-64-Release-devel.r
xxx_component3-1.0-2-3gsjcgd.xc-linux-x86-64-Release-devel.r
xxx_component4-0.0-2-2acd314.xc-linux-x86-64-Release-devel.r

$ grep -oP '(?<=_)[^-]*' file
component1
component2
component3
component4

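Here we use a positive lookbehind assertion, (?<=_), so the match starts right after the _ and runs up to, but does not include, the first -. The -o option prints only the matched part of each line.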
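The same lookaround idea could plausibly cover Part 2 of the question (the version number). The following is a minimal sketch, not a definitive solution. It assumes GNU grep built with PCRE support, and that every version starts with a digit immediately after the first hyphen and is followed by .xc-linux. The temporary file and the variable names sample and versions are made up for illustration.

```shell
# Recreate the sample data from the question in a temporary file
sample=$(mktemp)
cat > "$sample" <<'EOF'
xxx_component1-1.0-2-2acd314.xc-linux-x86-64-Release-devel.r
xxx_component2-3.0-1-fg3sdhd.xc-linux-x86-64-Release-devel.r
xxx_component3-1.0-2-3gsjcgd.xc-linux-x86-64-Release-devel.r
xxx_component4-0.0-2-2acd314.xc-linux-x86-64-Release-devel.r
EOF

# (?<=-)          lookbehind: the match must be preceded by a hyphen
# [0-9].*         a digit, then as much as possible ...
# (?=\.xc-linux)  ... as long as ".xc-linux" follows (lookahead, not printed)
versions=$(grep -oP '(?<=-)[0-9].*(?=\.xc-linux)' "$sample")
printf '%s\n' "$versions"

rm -f "$sample"
```

For the sample data this should print 1.0-2-2acd314, 3.0-1-fg3sdhd, 1.0-2-3gsjcgd and 0.0-2-2acd314, one per line.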

`awk`

$ awk -F"[_-]" '{print $2}' file
component1
component2
component3
component4

We tell awk to use - and _ as field separators and print the second field.
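Splitting on - does not work as neatly for the version number from Part 2, because the version itself contains hyphens. One possible workaround, sketched here under the assumption that the version is everything between the first hyphen and .xc-linux, is to trim both ends of the line with sub(). The names sample and versions are illustrative only.

```shell
# Recreate the sample data from the question in a temporary file
sample=$(mktemp)
cat > "$sample" <<'EOF'
xxx_component1-1.0-2-2acd314.xc-linux-x86-64-Release-devel.r
xxx_component2-3.0-1-fg3sdhd.xc-linux-x86-64-Release-devel.r
xxx_component3-1.0-2-3gsjcgd.xc-linux-x86-64-Release-devel.r
xxx_component4-0.0-2-2acd314.xc-linux-x86-64-Release-devel.r
EOF

versions=$(awk '{
    sub(/^[^-]*-/, "")      # drop everything up to and including the first hyphen
    sub(/\.xc-linux.*$/, "") # drop ".xc-linux" and everything after it
    print
}' "$sample")
printf '%s\n' "$versions"

rm -f "$sample"
```

Both sub() calls operate on the whole line ($0), so no field separator is needed here.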

`sed`

That said, you can also use sed with a capture group to extract the required data:

$ sed 's/.*_\([^-]*\)-.*/\1/' file
component1
component2
component3
component4

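The regex matches any character zero or more times up to the _. From there, it captures everything up to the first - in a group. In the replacement part we simply use a back-reference (i.e. \1) to recall the data captured in the group; the rest of the line is discarded because the pattern matched all of it.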
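A second capture group could be used to pull out the version number from Part 2 in the same pass. This is only a sketch under the assumption that the version is everything between the first hyphen and .xc-linux. It also replaces the backtick for loop from the question with a while read loop. The names sample, pairs, comp and version are illustrative.

```shell
# Recreate the sample data from the question in a temporary file
sample=$(mktemp)
cat > "$sample" <<'EOF'
xxx_component1-1.0-2-2acd314.xc-linux-x86-64-Release-devel.r
xxx_component2-3.0-1-fg3sdhd.xc-linux-x86-64-Release-devel.r
xxx_component3-1.0-2-3gsjcgd.xc-linux-x86-64-Release-devel.r
xxx_component4-0.0-2-2acd314.xc-linux-x86-64-Release-devel.r
EOF

# ^[^_]*_       skip the prefix up to and including the first underscore
# \([^-]*\)     group 1: the component name (everything before the first hyphen)
# -\(.*\)       group 2: the version, greedy ...
# \.xc-linux.*  ... but it must be followed by ".xc-linux"
# -n plus the p flag prints only lines where the substitution succeeded
pairs=$(sed -n 's/^[^_]*_\([^-]*\)-\(.*\)\.xc-linux.*/\1 \2/p' "$sample")

printf '%s\n' "$pairs" | while read -r comp version; do
    echo "comp - $comp, version - $version"
done

rm -f "$sample"
```

For the first sample line the loop should print "comp - component1, version - 1.0-2-2acd314".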

Extract pattern between a substring and first occurrence of a number in a string
