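First off, apologies if the title isn't the clearest; I wasn't sure how best to phrase the problem.
Basically, I receive data in a bash script (I have no control over the format of that data), and it arrives in the following format: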
(Name: Foo bar; UUID: <blah-blah-0101>; AnotherField: Some text; TieredField: (Number: 123; Text: More Text; YetAnotherTier: (Name: somename; IP: 125.214.21.4) ; ) ; NumericalData: 4; MoreInfo: Some Information) ;
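What I want to do is loop over each key/value pair so that I can process the information. Stripping the leading and trailing "(", ")" and ";" is obviously simple. I then thought of replacing each ";" with a newline, but that breaks because of the nested tiers.
As for the tiers, I don't care about looping inside them; I only care about the top level. So: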
TieredField: (Number: 123; Text: More Text; YetAnotherTier: (Name: somename; IP: 125.214.21.4) ; )
is, as far as I'm concerned, a single pair.
Expected result:
Name: Foo bar
UUID: <blah-blah-0101>
AnotherField: Some text
TieredField: (Number: 123; Text: More Text; YetAnotherTier: (Name: somename; IP: 125.214.21.4) ; )
NumericalData: 4
MoreInfo: Some Information
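Since I'm familiar with looping over the lines of a block of text, converting the original string into the result above would be enough, although an answer that loops directly over each of the lines above would also be fine.
I'm not sure how to approach this, so any pointers would be appreciated.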
Answer 0 (score: 2)
This works:
# strip stdin up to and including the first '('
# note: read without IFS= discards whitespace, so spaces inside values are lost
cut -d '(' -f2- | while read -r -n1 c; do
    case $c in
    ')') break ;;
    # any letter read here is part of a field name, so just print it
    [a-zA-Z]) echo -n "$c" ;;
    # a colon separates the name from the value
    :)
        echo -n ': '
        l=0
        while read -n1 c; do
            case "$c" in
            # we need to count the nesting level of '(' and ')'
            '(') ((l++)); echo -n '(' ;;
            ')') ((l--))
                # if the level drops below zero, break out of here; see the `MoreInfo:` case
                if ((l<0)); then
                    echo; break
                else
                    echo -n ')'
                    if ((l==0)); then
                        echo; break
                    fi
                fi
                ;;
            # ';' separates the next field, but only at level zero; otherwise it belongs to a nested field
            ';')
                if ((l==0)); then
                    echo
                    break
                else
                    echo -n "$c"
                fi
                ;;
            *) echo -n "$c" ;;
            esac
        done
        # if the level is below zero, break; see the `MoreInfo:` case
        if ((l<0)); then break; fi
        ;;
    " ") ;;
    esac
done
# discard whatever is left on stdin
cat >/dev/null
For the following input:
(Name: Foo bar; UUID: <blah-blah-0101>; AnotherField: Some text; TieredField: (Number: 123; Text: More Text; YetAnotherTier: (Name: somename; IP: 125.214.21.4) ; ) ; NumericalData: 4; MoreInfo: Some Information) ;
it produces the output:
Name: Foobar
UUID: <blah-blah-0101>
AnotherField: Sometext
TieredField: (Number:123;Text:MoreText;YetAnotherTier:(Name:somename;IP:125.214.21.4);)
NumericalData: 4
MoreInfo: SomeInformation
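The output above loses the spaces inside values ("Foobar", "Sometext") because `read` without `IFS=` discards whitespace. As a rough sketch of one way around that, the same depth-counting idea can be applied to a string held in a variable, walking it character by character with substring expansion instead of `read`. The function names `emit_pair` and `split_top_level` are hypothetical and not part of the answer above; the sketch assumes the whole record fits in one string and that parentheses are balanced.

```shell
#!/usr/bin/env bash

# emit_pair (hypothetical helper): print one pair with surrounding blanks trimmed.
emit_pair() {
    local f=$1
    f=${f#"${f%%[![:space:]]*}"}   # trim leading whitespace
    f=${f%"${f##*[![:space:]]}"}   # trim trailing whitespace
    if [[ -n $f ]]; then
        printf '%s\n' "$f"
    fi
}

# split_top_level (hypothetical helper): print each top-level "Key: value"
# pair of the record passed as $1 on its own line.
split_top_level() {
    local s=$1 depth=0 field='' c i
    s=${s#*\(}    # drop everything up to and including the first '('
    s=${s%\)*}    # drop the last ')' and everything after it
    for ((i = 0; i < ${#s}; i++)); do
        c=${s:i:1}
        case $c in
            '(') depth=$((depth + 1)) ;;
            ')') depth=$((depth - 1)) ;;
        esac
        if [[ $c == ';' && $depth -eq 0 ]]; then
            # a ';' outside any parentheses ends the current pair
            emit_pair "$field"
            field=''
        else
            field+=$c
        fi
    done
    # the last pair has no trailing ';'
    emit_pair "$field"
}
```

It could be called as `split_top_level "$data"`, where `data` holds the raw record. Nested semicolons are kept verbatim because they are only treated as separators when the depth counter is zero.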
Answer 1 (score: 1)
Here is one more script:
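#!/bin/bash
# strip the outermost '(' and ')' and append a ';' so the last pair is terminated too
WITHOUT_OUTER="`cat input.txt | cut -d"(" -f2- | rev | cut -d")" -f2- | rev`;"
PAIR=''
CNT=0
NEWLINE=0
OLD_IFS=$IFS
# empty IFS so that read keeps space characters
IFS=''
while read -n1 C
do
    # track the nesting depth
    if [ "$C" == '(' ]
    then
        CNT=$((CNT+1))
    elif [ "$C" == ')' ]
    then
        CNT=$((CNT-1))
    fi
    if [ $CNT -eq 0 ]
    then
        # a top-level ';' ends the pair
        if [ "$C" == ';' ]
        then
            PAIR="$PAIR\n"
            NEWLINE=1
        fi
    elif [ "$C" == ';' ]
    then
        # a nested ';' is kept as-is
        PAIR="$PAIR$C"
    fi
    if [ "$C" != ";" ]
    then
        if [ ! $NEWLINE -eq 1 ]
        then
            PAIR="$PAIR$C"
        else
            # skip the single character that follows a top-level ';'
            NEWLINE=0
        fi
    fi
done < <(echo "$WITHOUT_OUTER")
IFS=$OLD_IFS
echo -e "$PAIR" > output.txt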
The formatted values are written to output.txt. Running cat output.txt shows the result:
Name: Foo bar
UUID: <blah-blah-0101>
AnotherField: Some text
TieredField: (Number: 123; Text: More Text; YetAnotherTier: (Name: somename; IP: 125.214.21.4) ; )
NumericalData: 4
MoreInfo: Some Information
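Once the pairs are one per line, as in the output above, the loop the question asks for could look roughly like the following. This is only a sketch: `process_pairs` is a hypothetical name, and it assumes every key is separated from its value by the first ": " on the line, as in the sample data.

```shell
#!/usr/bin/env bash

# process_pairs (hypothetical helper): read "Key: value" lines on stdin
# and print each pair as key=[value].
process_pairs() {
    local line key value
    # the extra test handles a final line that lacks a trailing newline
    while IFS= read -r line || [[ -n $line ]]; do
        [[ $line == *": "* ]] || continue   # skip lines without a separator
        key=${line%%: *}     # text before the first ": "
        value=${line#*: }    # text after the first ": "
        printf '%s=[%s]\n' "$key" "$value"
    done
}
```

For example, `process_pairs < output.txt` would treat the whole TieredField line as a single pair, since only the first ": " on each line is used as the split point. The `printf` is a placeholder for whatever processing each pair actually needs.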
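Answer 2 (score: 0)
This is very inefficient, but it works: the loop looks for the first '(' and the last ')' and then prints everything between them as a single string (I also assume that the character '_' is not used in the data):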
t=''
n=0
oIFS=$IFS
IFS=';'
for f in $(sed -e 's/^(//' -e 's/) ;$//')
do
    if [[ $f = *'('* ]]; then
        t="${t}_ $f"
        let n++
    elif [[ $f = *')'* ]]; then
        t="${t}_ $f"
        let n--
        # back at the top level: print the collected tier and reset it for the next one
        [[ $n -eq '0' ]] && echo ${t##_ } && t=''
    elif [[ $n -ne '0' ]]; then
        t="${t}_ $f"
    else
        echo ${f## }
    fi
done | IFS=$oIFS sed 's/_/;/g'
The output is:
Name: Foo bar
UUID: <blah-blah-0101>
AnotherField: Some text
TieredField: (Number: 123; Text: More Text; YetAnotherTier: (Name: somename; IP: 125.214.21.4) ; )
NumericalData: 4
MoreInfo: Some Information