Question

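I have no experience with awk. I need to delete lines from a yaml file based on strings in a second file. awk, bash, or sed are the only acceptable solutions; please do not use perl, python, ruby, or libraries.

I started with the following, which matches the first pattern but skips or misses all subsequent patterns.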

awk 'NR==FNR { pat[$0];next } NR>FNR { for (p in pat) if ($0 !~ p) {print;next} }' patterns yaml_file

The input files are excerpts of the source files.

Pattern strings file

#Remove the entire yaml line for the following
kubectl.kubernetes.io/last-applied-configuration
openshift.io/backup-registry-hostname

#remove the entire yaml block for the following
managedFields
status

And the yaml file

apiVersion: v1
items:
- apiVersion: apps.openshift.io/v1
  metadata:
    annotations:
      kubectl.kubernetes.io/last-applied-configuration: |
        {"apiVersion":"apps.openshift.io/v1","kind":"DeploymentConfig","metadata":{"annotations":{"openshift.io/backup-registry-hostname":"docker-registry.default.svc:5000",,"spec":}
      openshift.io/backup-registry-hostname: docker-registry.default.svc:5000
  status:
    ingress:
    - conditions:
metadata:
  selfLink: ""

Answer 1

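First, some general things:

I'm (kind of) sorry, but the first one is RTFM. There is this awesome doc for gawk, but mawk's man page is also a good starting point.
NR==FNR is true for every line of the first input file (patterns), not just once, because NR is the line number across all files and FNR is the line number within the current file. The real problem in your command is the loop: it prints the line and jumps to the next one as soon as a single pattern fails to match, so a line is only dropped if it matches every pattern.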
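To see how NR and FNR behave across two input files, here is a small sketch. The file contents (p1, y1, ...) and the temporary directory are made up for the demonstration; only the file names follow the question.

```sh
# Create two tiny throwaway input files (contents are invented for this demo).
tmp=$(mktemp -d)
printf 'p1\np2\n' > "$tmp/patterns"
printf 'y1\ny2\ny3\n' > "$tmp/yaml_file"

# For every input line, print the file name, NR, FNR and whether NR == FNR holds.
out=$(cd "$tmp" && awk '{ print FILENAME, NR, FNR, (NR == FNR ? "first file" : "later file") }' patterns yaml_file)
printf '%s\n' "$out"

rm -rf "$tmp"
```

NR keeps counting from 1 to 5, while FNR restarts at 1 for yaml_file, so the comparison holds for both lines of patterns and for none of the yaml_file lines.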

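But since the task is fun and I like awk, here is my solution (it works with mawk and with flavours that offer a superset of its functions, e.g. gawk). Personally, I would split the patterns into two files (see below); otherwise you have to split them inside awk. That is possible, but since it relies on string matching in the comment lines, it is a poor approach.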
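If you go the two-file route, the split itself can be done once with awk. This is only a sketch: it assumes the comment lines contain the words "line" and "block" exactly as in the question's patterns file, and it writes to patterns_lines and patterns_blocks, the names used by the split-file solution. The temporary directory is just for the demonstration.

```sh
# Recreate the combined pattern file from the question in a throwaway directory.
tmp=$(mktemp -d)
cat > "$tmp/patterns" <<'EOF'
#Remove the entire yaml line for the following
kubectl.kubernetes.io/last-applied-configuration
openshift.io/backup-registry-hostname

#remove the entire yaml block for the following
managedFields
status
EOF

(
  cd "$tmp" || exit 1
  awk '
    /^#.*block/  { target = "patterns_blocks"; next }  # section header for block patterns
    /^#.*line/   { target = "patterns_lines";  next }  # section header for line patterns
    /^#/ || /^$/ { next }                              # other comments and empty lines
    target != "" { print > target }                    # write the pattern to its section file
  ' patterns
)

lines=$(cat "$tmp/patterns_lines")
blocks=$(cat "$tmp/patterns_blocks")
printf 'lines:\n%s\nblocks:\n%s\n' "$lines" "$blocks"

rm -rf "$tmp"
```

The header detection is the same comment-based matching I called fragile above, so it is better to run this once and maintain the two files by hand afterwards.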

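Solution based on your input

You should store the awk source code in a file. It makes debugging easier, and many editors offer syntax highlighting. Name it something like reduce_yaml.awk and call it with awk -f reduce_yaml.awk patterns yaml_file. The script compares FILENAME against the literal name patterns, so the pattern file has to be passed under exactly that name.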

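# Main rule: runs for every line of both input files.
{
    if(FILENAME == "patterns"){
        # Pattern file: the comment lines decide which list the following patterns go into.
        if($0 ~ /^#.*block/) {
            inpat = "blocks"
        } else if($0 ~ /^#.*line/) {
            inpat = "lines"
        } else if($0 !~ /^#|^$/){
            # Neither a comment nor an empty line, so store it as a pattern.
            if(inpat == "lines"){
                patterns_lines[$0] = 1
            } else if(inpat == "blocks"){
                patterns_blocks[$0] = 1
            }
        }
    } else {
        # YAML file: beginwith holds the indentation of the block currently being removed.
        if(beginwith && substr($0, 1, length(beginwith)) != beginwith){
            # The line no longer starts with that indentation, so the block has ended.
            beginwith = ""
        } else if(beginwith){
            # Still inside the block: drop the line.
            next
        }

        # Drop single lines that match a line pattern (patterns are used as regexes).
        for (p in patterns_lines){
            if ($0 ~ p) next
        }
        # On a block pattern, drop the key line and remember the indentation of its children.
        for (p in patterns_blocks){
            if ($0 ~ p) {
                # match() gives the 1-based position of the first non-space character,
                # so the prefix is the key's own indentation plus two spaces.
                beginwith = sprintf("%*s", match($0, /[^ ]/) + 1, "")
                next
            }
        }
        print
    }
}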
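The block removal hinges on how beginwith is built, so here is a small sketch of just that step. It assumes two-space YAML indentation and an awk that supports the `*` width in sprintf (as the script itself does); the four input lines are taken from the question's yaml excerpt.

```sh
# Feed the status block from the question plus the following top-level key.
out=$(printf '%s\n' '  status:' '    ingress:' '    - conditions:' 'metadata:' |
awk '
NR == 1 {
    # Position of the first non-space character (1-based).
    pos = match($0, /[^ ]/)
    # A string of pos + 1 spaces: the indentation expected for the children of the key.
    prefix = sprintf("%*s", pos + 1, "")
    print "first non-space at " pos ", prefix length " length(prefix)
    next
}
{
    # Lines that start with the prefix belong to the block and would be skipped.
    if (substr($0, 1, length(prefix)) == prefix) print "skipped: " $0
    else print "kept: " $0
}')
printf '%s\n' "$out"
```

The key sits at indentation 2, so the prefix is 4 spaces; both child lines start with it and are skipped, while metadata: ends the block.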

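Solution using split pattern files (`patterns_blocks` and `patterns_lines`)

Call it with awk -f reduce_yaml.awk yaml_file. The two pattern files named in the heading must exist in the current working directory, because the script opens them by relative name.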

BEGIN {
    # Load the line patterns, skipping comments and empty lines.
    while ((getline line < "patterns_lines") > 0){
        if(line !~ /^#|^$/) patterns_lines[line] = 1
    }
    close("patterns_lines")

    # Load the block patterns the same way.
    while ((getline line < "patterns_blocks") > 0){
        if(line !~ /^#|^$/)  patterns_blocks[line] = 1
    }
    close("patterns_blocks")
}

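{
    # beginwith holds the indentation of the block currently being removed.
    if(beginwith && substr($0, 1, length(beginwith)) != beginwith){
        # The line no longer starts with that indentation, so the block has ended.
        beginwith = ""
    } else if(beginwith){
        # Still inside the block: drop the line.
        next
    }

    # Drop single lines that match a line pattern.
    for (p in patterns_lines){
        if ($0 ~ p) next
    }
    # On a block pattern, drop the key line and remember the indentation of its children.
    for (p in patterns_blocks){
        if ($0 ~ p) {
            beginwith = sprintf("%*s", match($0, /[^ ]/) + 1, "")
            next
        }
    }
    print
}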
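For reference, here is a sketch of a complete run of the split-file variant against the excerpts from the question. Everything is created in a temporary directory purely for the demonstration; the script text is the one above, with close() called on the same names that getline reads from.

```sh
tmp=$(mktemp -d)

# The two pattern files, one pattern per line.
printf '%s\n' 'kubectl.kubernetes.io/last-applied-configuration' \
              'openshift.io/backup-registry-hostname' > "$tmp/patterns_lines"
printf '%s\n' 'managedFields' 'status' > "$tmp/patterns_blocks"

# The yaml excerpt from the question.
cat > "$tmp/yaml_file" <<'EOF'
apiVersion: v1
items:
- apiVersion: apps.openshift.io/v1
  metadata:
    annotations:
      kubectl.kubernetes.io/last-applied-configuration: |
        {"apiVersion":"apps.openshift.io/v1","kind":"DeploymentConfig","metadata":{"annotations":{"openshift.io/backup-registry-hostname":"docker-registry.default.svc:5000",,"spec":}
      openshift.io/backup-registry-hostname: docker-registry.default.svc:5000
  status:
    ingress:
    - conditions:
metadata:
  selfLink: ""
EOF

# The awk script from this answer.
cat > "$tmp/reduce_yaml.awk" <<'EOF'
BEGIN {
    while ((getline line < "patterns_lines") > 0){
        if(line !~ /^#|^$/) patterns_lines[line] = 1
    }
    close("patterns_lines")
    while ((getline line < "patterns_blocks") > 0){
        if(line !~ /^#|^$/) patterns_blocks[line] = 1
    }
    close("patterns_blocks")
}
{
    if(beginwith && substr($0, 1, length(beginwith)) != beginwith){
        beginwith = ""
    } else if(beginwith){
        next
    }
    for (p in patterns_lines){
        if ($0 ~ p) next
    }
    for (p in patterns_blocks){
        if ($0 ~ p) {
            beginwith = sprintf("%*s", match($0, /[^ ]/) + 1, "")
            next
        }
    }
    print
}
EOF

# Run from inside the directory so the relative pattern file names resolve.
out=$(cd "$tmp" && awk -f reduce_yaml.awk yaml_file)
printf '%s\n' "$out"

rm -rf "$tmp"
```

With this input, the lines from apiVersion: v1 down to annotations: are kept, the two annotation keys and the whole status block are removed, and the final metadata: and selfLink lines are kept. Note that the JSON continuation line under last-applied-configuration only disappears because it happens to contain the second line pattern; a multi-line value that matches no pattern would stay in the output.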
