Question

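In a bash shell (the bash version may be old, since this is Solaris 5.8), how can I use awk or sed to join the lines between occurrences of a "repeating" pattern into a single line?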

[Edit: explaining myself better]

My file contains many entries like this:

my-group<--------------------------(main entry)
<tab>group-code<spaces>AXZ1<-------(sub-section under main entry)
<tab>description
<tab>state<spaces>CA
<tab>items
<tab><spaces>item_value_1
<tab><spaces>item_value_2
<tab><tab>header_3 <---------------(sub-section under sub-section) (can go up to a 5th level)
<tab><tab>header_3_item_1<spaces>value

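I want to convert it as follows: a new line starts whenever the first column of a line contains an alphanumeric value. Otherwise the line is appended to the current one, with every TAB replaced by a single "|" and parameters and values separated by ":":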
my-group|group-code:AXZ1|description:|state:CA|items:something:something2|last-member-name:XYZ
my-group|group-code:PORTU1|description:|state:CT|items:something:something2|last-member-name:FQRTZ

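How can I do this? The only approach I can think of is to open the file, read it line by line, and do the work in memory. Is that the only way, or is there a sed/awk command for it?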

Here is the bash code with which I tried to achieve this (not working yet):

#!/bin/bash
myFile=$1

function trim ()
{
    local var=$@
    var=$(echo $var|sed -e "s/^\s*//" -e "s/\s*$//" -e "s/[ \t]/:/g")
    echo -n "$var"
}

newLine=''
i=0
while read line
do
    i=$[i + 1]
    [ -z "$line" ] && continue
    if [[ $line =~ ^[[:alnum:]] ]]    # <-- this is not working: it matches every line
    then
        newLine=$(trim "$line")
        match="matched ^a-zA-Z0-9"
    elif [[ $line =~ ^[[:space:]] ]]
    then
        line="$(trim "$line")"
        newLine="${newLine}|${line}"
        match="matched ^tab/space"
    fi
    echo -e "line number=$i match=$match line=$line new-ine value-->"$newLine"<--"
    echo
done < $myFile

TY。

Answer 1

This can be done with the following sed script:

:a
N
s/\(\n\)    \([-a-z][-a-z]*\)/|\2\1/
s/\n  */:/
$!ta
s/:|/:/g
P
d

On your input, it produces the expected output:

% sed -f script.sed data
my-group|group-code:AXZ1|description:|state:CA|items:something:something2|last-member-name:XYZ
my-group|group-code:PORTU1|description:|state:CT|items:something:something2|last-member-name:FQRTZ

where script.sed contains the script above.

Explanation

:a       Label marking the start of our loop
N        Read next line of input
s/…/…/   If the structure matches a key:value declaration, translate it
$!ta      and return to a, to read the next key (unless we're at end of file)
s/:|/:/g Otherwise, clean the fields, 
P         print text gathered so far, 
d         and start a new cycle

Note that my sed does not recognise \n in the replacement text of the s command, which is why I had to capture it in a group.
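The point about \n can be shown in isolation. The following is a minimal sketch, assuming a POSIX sed; swap_newline and the literal "key" are hypothetical names used only for illustration. The newline is matched inside a group on the pattern side and put back with \1, so \n never has to appear in the replacement text.

```shell
# Hypothetical helper: read two lines, then move the newline behind an
# inserted "|". The newline is captured as \1 and the word "key" as \2.
swap_newline() {
  sed -e 'N' -e 's/\(\n\)\(key\)/|\2\1/'
}

# The two input lines become "group|key" followed by an empty line.
printf 'group\nkey\n' | swap_newline
```

The trailing empty line appears because the captured newline is still in the pattern space when it is printed.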

Answer 2

I was playing around with GNU awk, -v RS= and gensub(), but that seemed too similar to the other answers.

Here is an awk command that makes the leading whitespace significant by using -F"[[:space:]]*":

awk -F"[[:space:]]*" '
  NF==1 {if(b!="") print b; b=$1}
  NF==2 {b=b (b~/:$/?"":":") $2}
  NF==3 {b=b "|"$2":"$3}
  END {print b}' data

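Here is the walk-through:

When NF==1, print the previous b (if there is one) and start a new output line b.
When NF==2, grab the unlabeled field and append it to b under the previous label. A ternary operator decides whether a ":" has to be added.
When NF==3, format the key/value pair and append it to b.
In END, print whatever is still stored in b.

In other words, build up a buffer line by line, then output it when a new record is encountered or at END.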
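The same buffer-and-flush idea can also be keyed on indentation depth instead of field count. What follows is a minimal sketch, not a drop-in replacement: it assumes the space-indented sample data shown in this thread, where labels are indented by at most 4 characters and bare values by more. The function name join_records and the variable buf are hypothetical.

```shell
# Sketch only: assumes labels are indented by at most 4 characters and
# bare values by more, as in the sample data shown in this thread.
join_records() {
  awk '
    /^[ \t]*$/ { next }                       # skip blank lines
    /^[^ \t]/  {                              # unindented line: new record
      if (buf != "") print buf
      buf = $1
      next
    }
    {
      match($0, /^[ \t]+/)                    # RLENGTH is the indent width
      if (RLENGTH <= 4)
        buf = buf "|" $1 ":" $2               # label line, value may be empty
      else
        buf = buf (buf ~ /:$/ ? "" : ":") $1  # bare value under the last label
    }
    END { if (buf != "") print buf }
  ' "$@"
}
```

It can be called as join_records data, or fed through a pipe. On older Solaris systems the POSIX-style awk is typically nawk or /usr/xpg4/bin/awk rather than /usr/bin/awk, so the awk call may need to be adjusted there.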

For reference, here is a copy of the data from the original question:

my-group
    group-code                     AXZ1
    description
    state                          CA
    items
                                   something
                                    something2
    last-member-name             XYZ

my-group
    group-code                     PORTU1
    description
    state                          CT
    items
                                   something
                                    something2
    last-member-name             FQRTZ

Answer 3

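I would like to thank everyone who answered my original question. I will accept one of your answers.

However, this is what I ended up with, and it works well.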

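#!/bin/bash
myfile=$1

# Strip leading/trailing whitespace and turn the first run of blanks into ":".
function trim ()
{
    local var=$@
    var=$(echo "$var"|sed -e "s/^\s*//" -e "s/\s*$//" -e "s/[ \t]\{1,\}/:/")
    echo -n "$var"
}

newLine=''
i=0
linesInFile=$(wc -l "$myfile"|awk '{print $1}')
# IFS= keeps the leading whitespace that the tests below rely on.
while IFS= read  line
do
    i=$((i + 1))
    # Skip lines that contain no alphanumeric character or "*" (e.g. blank lines).
    [[ ! $line =~ [[:alnum:]\*] ]] && continue
    if [[ $line =~ ^[[:alnum:]] ]]; then
        # Main entry: print the previous record, then start a new one.
        if [[ $newLine != '' ]]; then
            echo "$newLine"
        fi
        newLine=$(trim "$line")
    elif [[ $line =~ ^[[:space:]]{4,} ]]; then
        # Four or more leading whitespace characters: append with ":".
        newLine="${newLine}:$(trim "$line")"
    elif [[ $line =~ ^[[:space:]] ]]; then
        # Any other indented line: append as a new field with "|".
        newLine="${newLine}|$(trim "$line")"
    fi
    # Print the final record once the last line has been read.
    if [[ $linesInFile -eq $i ]]; then
        echo "$newLine"
    fi
done < "$myfile"
IFS=$' \t\n'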
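One visible difference from the attempt in the question is the use of IFS= read instead of a plain read. With the default IFS, read strips leading whitespace before the line reaches the variable, which would explain why the ^[[:alnum:]] test in the question matched every line. A small sketch of the difference, where read_default and read_verbatim are hypothetical helper names:

```shell
# Reads one line with the default IFS: leading blanks are stripped.
read_default() { read -r line; printf '[%s]\n' "$line"; }

# Reads one line with IFS emptied for the read: leading blanks are kept.
read_verbatim() { IFS= read -r line; printf '[%s]\n' "$line"; }

printf '    state   CA\n' | read_default    # prints [state   CA]
printf '    state   CA\n' | read_verbatim   # prints [    state   CA]
```

Because IFS= is given as a prefix to read, it applies to that command only, so the surrounding script's IFS is left unchanged.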
